How would you construct this SQL query? (MySQL) - sql

Let's assume we have these tables:
product
product_id
product_name
category
category_id
category_name
product_in_category
product_in_category_id
product_id
category_id
How would you get all products that are not in a specific category in the product_in_category table (without duplicates)?
In other words, all products that have not been assigned to category 10, for instance.
Also, if a product is in categories 1, 5 and 10, it shouldn't show up in the result.

Using LEFT JOIN/IS NULL:
SELECT p.*
FROM PRODUCT p
LEFT JOIN PRODUCT_IN_CATEGORY pic ON pic.product_id = p.product_id
AND pic.category_id = 10
WHERE pic.product_in_category_id IS NULL
Using NOT IN
SELECT p.*
FROM PRODUCT p
WHERE p.product_id NOT IN (SELECT pic.product_id
FROM PRODUCT_IN_CATEGORY pic
WHERE pic.category_id = 10)
Using NOT EXISTS
SELECT p.*
FROM PRODUCT p
WHERE NOT EXISTS (SELECT NULL
FROM PRODUCT_IN_CATEGORY pic
WHERE pic.product_id = p.product_id
AND pic.category_id = 10)
Which is best?
It depends on whether the columns being compared are nullable (can hold NULL). In MySQL, if they are nullable, NOT IN and NOT EXISTS are more efficient; if they are not nullable, LEFT JOIN/IS NULL is more efficient.
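Nullability matters for correctness as well as speed: if the NOT IN subquery can return a NULL, NOT IN matches nothing at all. The sketch below is only an illustration, run against an in-memory SQLite database rather than MySQL; the sample rows and the deliberately nullable product_id column are assumptions made for the demo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE product (product_id INTEGER PRIMARY KEY, product_name TEXT);
    CREATE TABLE product_in_category (
        product_in_category_id INTEGER PRIMARY KEY,
        product_id  INTEGER,  -- nullable on purpose (assumption for the demo)
        category_id INTEGER
    );
    INSERT INTO product VALUES (1, 'in 1, 5, 10'), (2, 'in 1 only'), (3, 'uncategorised');
    INSERT INTO product_in_category (product_id, category_id)
    VALUES (1, 1), (1, 5), (1, 10), (2, 1);
""")

# The three anti-join forms from the answer above, selecting only the id.
QUERIES = {
    "LEFT JOIN": """
        SELECT p.product_id FROM product p
        LEFT JOIN product_in_category pic
               ON pic.product_id = p.product_id AND pic.category_id = 10
        WHERE pic.product_in_category_id IS NULL
        ORDER BY p.product_id""",
    "NOT IN": """
        SELECT p.product_id FROM product p
        WHERE p.product_id NOT IN (SELECT pic.product_id
                                   FROM product_in_category pic
                                   WHERE pic.category_id = 10)
        ORDER BY p.product_id""",
    "NOT EXISTS": """
        SELECT p.product_id FROM product p
        WHERE NOT EXISTS (SELECT NULL
                          FROM product_in_category pic
                          WHERE pic.product_id = p.product_id
                            AND pic.category_id = 10)
        ORDER BY p.product_id""",
}

def run_all():
    """Run every query and return {form name: list of product ids}."""
    return {name: [row[0] for row in conn.execute(sql)]
            for name, sql in QUERIES.items()}

before = run_all()  # all three forms agree: products 2 and 3

# A stray row with a NULL product_id in category 10 changes only NOT IN.
conn.execute(
    "INSERT INTO product_in_category (product_id, category_id) VALUES (NULL, 10)")
after = run_all()

print(before)
print(after)
```

With the NULL row present, `x NOT IN (1, NULL)` evaluates to NULL rather than TRUE for every non-matching x, so the NOT IN form returns no rows while the other two are unaffected.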

SELECT * FROM product_in_category WHERE category_id!=10
or am I missing something? I guess I was missing the no duplicates part, look at InSane's better answer.

SELECT product.product_id FROM product
LEFT OUTER JOIN product_in_category
ON product.product_id = product_in_category.product_id
AND product_in_category.category_id = 10
WHERE product_in_category.product_id IS NULL
GROUP BY product.product_id

how to convert array of integers after array_agg into values for IN clause

Could you please help me? I've been trying to resolve this for quite a long time.
I have the tables Product and RelatedProducts (top-level products consist of other base products). Goal: I'd like to get all base products.
So, table looks like:
product_id related_product_ids
------------------------------------------------
1143 1213
1255 1245
1261 1229,1239,1309,1237,1305,1243,1143
I got this with the following query:
select max(p.id) as product_id, array_to_string(array_agg(p2p.related_product_id), ',') as related_product_ids
from product p
left join product_to_product p2p on p2p.product_id = p.id
where p.id in (select product_id from order_line where wo_id = 262834)
group by p.id, p2p.product_id
I'd like to feed related_product_ids into the product table to get all related products.
So I built an array of all the necessary values by running
select array_agg(p2p.related_product_id) as id
from product p
left join product_to_product p2p on p2p.product_id = p.id
where p.id in (select product_id from order_line where wo_id = 262834)
related_product_ids
---------------------------------------------
{1309,1143,1229,1239,1243,1237,1305,1245,1213}
I tried the following, without success:
select *
from product
where id = ANY(select array_agg(p2p.related_product_id) as id
from product p
left join product_to_product p2p on p2p.product_id = p.id
where p.id in (select product_id from order_line where wo_id = 262834))
Error: ERROR: operator does not exist: integer = integer[] Hint: No operator matches the given name and argument type(s). You might need to add explicit type casts. Position: 39, SQLState: 42883, ErrorCode: 0
or following:
select *
from product
where id in (select array_to_string(array_agg(p2p.related_product_id), ',') as id
from product p
left join product_to_product p2p on p2p.product_id = p.id
where p.id in (select product_id from order_line where wo_id = 262834))
Error: ERROR: operator does not exist: integer = integer[] Hint: No operator matches the given name and argument type(s). You might need to add explicit type casts. Position: 36, SQLState: 42883, ErrorCode: 0
and many other attempts.
So finally what I need is
select *
from product
where id in (1309,1143,1229,1239,1243,1237,1305,1245,1213)
(values from related_product_ids)
How can I convert the array of integers (related_product_ids) into a list of values? Or maybe you can suggest a different, better way?
DBFiddle
If you want to use the result as an array, you can do that with ANY - but the parameter has to be an array as well.
select *
from product
where id = any(array(select p2p.related_product_id
from product p
left join product_to_product p2p on p2p.product_id = p.id
where p.id in (1, 2, 3)))
But I think you are over complicating things. As far as I can tell, this can be simplified to:
select p1.*
from product p1
where exists (select *
from product_to_product p2p
where p2p.related_product_id = p1.id
and p2p.product_id in (1,2,3))
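As a rough check of the EXISTS rewrite, here is a small sketch using the rows shown in the question. It runs on an in-memory SQLite database rather than Postgres, so it says nothing about the array functions; the ids 1143, 1255 and 1261 stand in for the order_line subquery, and product 9999 is a made-up unrelated row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE product (id INTEGER PRIMARY KEY);
    CREATE TABLE product_to_product (product_id INTEGER, related_product_id INTEGER);
    INSERT INTO product VALUES
        (1143), (1213), (1229), (1237), (1239), (1243),
        (1245), (1255), (1261), (1305), (1309),
        (9999);  -- hypothetical product that is related to nothing
    INSERT INTO product_to_product VALUES
        (1143, 1213),
        (1255, 1245),
        (1261, 1229), (1261, 1239), (1261, 1309), (1261, 1237),
        (1261, 1305), (1261, 1243), (1261, 1143);
""")

# Products that appear as a related product of one of the ordered products.
# No array_agg, unnest or IN-list is needed; each product appears once.
related = [row[0] for row in conn.execute("""
    SELECT p1.id
    FROM product p1
    WHERE EXISTS (SELECT 1
                  FROM product_to_product p2p
                  WHERE p2p.related_product_id = p1.id
                    AND p2p.product_id IN (1143, 1255, 1261))
    ORDER BY p1.id
""")]

print(related)
```

The result contains the same nine ids as the hand-written IN list in the question, without aggregating and unpacking an array in between.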
Goal: I'd like to get all base products.
If I assume that a "base" product is one that never appears in the related_product_id column, then not exists comes to mind:
select p.*
from product p
where not exists (select 1
from product_to_product p2p
where p2p.related_product_id = p.id
);
I don't know why your = ANY doesn't work; it seems to me like it should. My guess is that because a select can theoretically return multiple rows, Postgres treats your array_agg result like the inner array of a nested array. The ANY "unnests" the first layer (the rows), but still leaves an int[] layer for the = to work with.
Since you didn't give create scripts for your tables, I've substituted ones from pgbench so that I could post tested code. The concept should apply back to your tables.
Your IN example works if you just get rid of the aggregation:
select * from pgbench_accounts where aid in (select bid from pgbench_branches);
Note that ANY also works when you don't aggregate:
select * from pgbench_accounts where aid =ANY (select bid from pgbench_branches);
Lists, arrays and sets are different things, but they can be used interchangeably in some situations. I don't know how to predict which ones without just trying them.
Error in your DBFiddle example is:
In the last query, just unnest the array instead of using array_to_string:
select * from product where id = ANY(select unnest(array_agg(p2p.related_product_id)) as id from product p
left join product_to_product p2p on p2p.product_id = p.id
where p.id in (1, 2, 3))

WHERE Clause for One-To-Many Association

I have two tables Products and ProductProperties.
Products
name - string
description - text
etc etc
ProductProperties
product_id - integer
property_id - integer
There is also a table Properties which basically stores the list of property names and their attributes
How can I write a SQL query that finds products having property_ids (A or B or C) AND (X or Y or Z)?
I've got up to here:
SELECT DISTINCT "products".*
FROM "products"
INNER JOIN "product_properties" ON "product_properties"."product_id" = "products"."id" AND "product_properties"."deleted_at" IS NULL
WHERE "products"."deleted_at" IS NULL
AND (product_properties.property_id IN (504, 506, 403))
AND (product_properties.property_id IN (520, 501, 502))
But it doesn't really work since it's looking for a Product Property which has both values 504 and 520, which will never exist.
Would appreciate some help!
You need to define intermediate resultsets on a property group basis:
SELECT DISTINCT p.*
FROM products p
JOIN product_properties groupA ON groupA.product_id = p.id AND groupA.deleted_at IS NULL AND groupA.property_id IN (504, 506, 403)
JOIN product_properties groupB ON groupB.product_id = p.id AND groupB.deleted_at IS NULL AND groupB.property_id IN (520, 501, 502)
WHERE p.deleted_at IS NULL
You see, you detected the problem yourself very nicely: "But it doesn't really work since it's looking for a Product Property which has both values 504 and 520, which will never exist."
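Indeed, every condition in the WHERE clause is tested against the same joined row, and a single product_properties row holds only one property_id, so it can never fall in both groups at once. You need to join the table once per property group (under a different alias each time) and apply each group's criteria to its own copy.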
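To see the one-join-per-group idea on data, here is a minimal sketch on an in-memory SQLite database. The sample rows are invented for the illustration, and the deleted_at filters are left out to keep it short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE product_properties (product_id INTEGER, property_id INTEGER);
    INSERT INTO products VALUES (1, 'p1'), (2, 'p2'), (3, 'p3'), (4, 'p4');
    INSERT INTO product_properties VALUES
        (1, 504), (1, 520),            -- one property from each group
        (2, 506), (2, 403),            -- two from group A, none from B
        (3, 403), (3, 502), (3, 501),  -- one from A, two from B
        (4, 501);                      -- group B only
""")

GROUP_A = (504, 506, 403)
GROUP_B = (520, 501, 502)

def marks(values):
    """Build a '?, ?, ?' placeholder list matching the number of values."""
    return ", ".join("?" for _ in values)

# product_properties is joined twice: alias a must hit group A,
# alias b must hit group B. DISTINCT removes the duplicate that
# product 3 would otherwise produce (it matches group B twice).
sql = f"""
    SELECT DISTINCT p.id
    FROM products p
    JOIN product_properties a
      ON a.product_id = p.id AND a.property_id IN ({marks(GROUP_A)})
    JOIN product_properties b
      ON b.product_id = p.id AND b.property_id IN ({marks(GROUP_B)})
    ORDER BY p.id
"""
matches = [row[0] for row in conn.execute(sql, GROUP_A + GROUP_B)]

print(matches)
```

Products 2 and 4 are dropped because each satisfies only one of the two groups.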
One method uses exists or in:
select p.*
from products p
where p.id in (select pp.product_id
from product_properties pp
where pp.property_id in ('504', '520')
);
This saves you from having to use distinct in the outer query.
If, perchance, you really mean finding the products that have all the properties, then a join and group by work:
select p.*
from products p join
product_properties pp
on p.id = pp.product_id
where pp.property_id in ('504', '520')
group by p.id -- yes, this is allowed in Postgres
having count(*) = 2;
Hi, try these queries. I only thought them through and didn't run any of them, so check them, but they should show the idea:
SELECT DISTINCT pr.*
FROM products pr
WHERE pr.id IN
(
SELECT product_id FROM product_properties WHERE property_id IN (504,520)
GROUP BY product_id
HAVING COUNT(*) = 2
) AND pr.deleted_at IS NULL
SELECT DISTINCT pr.*
FROM products pr
INNER JOIN (
SELECT product_id, COUNT(*) AS nbr FROM product_properties WHERE property_id IN (504,520)
GROUP BY product_id
) AS temp ON temp.product_id = pr.id
WHERE pr.deleted_at IS NULL AND temp.nbr = 2
You can also try this one (the join conditions could go in the WHERE clause instead of using INNER JOIN):
SELECT DISTINCT p.* FROM products AS p
INNER JOIN product_properties AS p1 ON p1.product_id = p.id
INNER JOIN product_properties AS p2 ON p2.product_id = p.id
WHERE p.deleted_at IS NULL
AND p1.property_id = 504 AND p1.deleted_at IS NULL
AND p2.property_id = 520 AND p2.deleted_at IS NULL

Select rows that don't have a corresponding join in join table

I have two SQL tables - customer and widget. There's a join table, customers_widgets between them, that has two columns (customer_id and widget_id)
Is there a way I can select all the customers that aren't joined to a widget? So they have an id that doesn't appear in the customer_id column on the join table?
In general I've found NOT IN to be expensive and slow, but your mileage may vary on different RDBMS.
The two alternatives that I most often use are:
SELECT
*
FROM
customer
WHERE
NOT EXISTS (SELECT *
FROM customers_widgets
WHERE customers_widgets.customer_id = customer.customer_id
)
And...
SELECT
customer.*
FROM
customer
LEFT JOIN
customers_widgets
ON customers_widgets.customer_id = customer.customer_id
WHERE
customers_widgets.customer_id IS NULL
Try this:
SELECT customer_id
FROM customer
WHERE customer_id NOT IN (SELECT customer_id
FROM customers_widgets)
You can use an OUTER JOIN for this:
Select C.*
From customer C
Left Join customers_widgets W On C.customer_id = W.customer_id
Where W.customer_id Is Null

Many to many query

I have two tables products and sections in a many to many relationship and a join table products_sections. A product can be in one or more sections (new, car, airplane, old).
Products
id  name
-----------------
1   something
2   something_else
3   other_thing
Sections
id  name
-----------------
1   new
2   car
Products_sections
product_id  section_id
--------------------------
1           1
1           2
2           1
3           2
I want to extract all products that are in both the new and the car sections. In this example the result should be product 1. What is the correct MySQL query to obtain this?
SELECT Products.name
FROM Products
WHERE NOT EXISTS (
SELECT id
FROM Sections
WHERE name IN ('new','car')
AND NOT EXISTS (
SELECT *
FROM Products_sections
WHERE Products_sections.section_id = Sections.id
AND Products_sections.product_id = Products.id
)
)
In other words, select those products for which none of the desired Section.id values is missing from the Products_sections table for that product.
In answer to andho's comment:
You can put
NOT EXISTS (<select query>)
into a WHERE clause like any other predicate. It will evaluate to TRUE if there are no rows in the result set described by <select query>.
Stepwise, here's how to get to this query as an answer:
Step 1. The requirement is to identify all products that are "in both the 'new' and 'car' sections".
Step 2. A product is in both the 'new' and 'car' sections if both the 'new' and 'car' sections contain the product. Equivalently, a product is in both the 'new' and 'car' sections if neither of those sections fails to contain the product. (Note the double negative: neither fails to contain.) Restated again, we want all the products for which there is no required section failing to contain the product.
The required sections are these:
SELECT id
FROM Sections
WHERE name IN ('new','car')
Therefore, the desired products are these:
SELECT Products.name
FROM Products
WHERE NOT EXISTS ( -- there does not exist
SELECT id -- a section
FROM Sections
WHERE name IN ('new','car') -- that is required
AND (the section identified by Sections.id fails to contain the product identified by Products.id)
)
Step 3. A given section (such as 'new' or 'car') does contain a particular product if there's a row in Products_sections for the given section and particular product. So a given section fails to contain a particular product if there is no such row in Products_sections.
Step 4. If the query below does contain a row, the section_id section does contain the product_id product:
SELECT *
FROM Products_sections
WHERE Products_sections.section_id = Sections.id
AND Products_sections.product_id = Products.id
So the section_id section fails to contain the product (and that's what we need to express) if the query above does not produce a row in its result, that is, if NOT EXISTS (<the query above>) is true.
Seems complicated, but once you get it in your head, it sticks: Are all required items present? Yes, so long as there does not exist a required item that is not present.
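For anyone who wants to watch the double NOT EXISTS work, here is a small sketch that loads the sample rows from the question into an in-memory SQLite database (SQLite rather than MySQL is just a convenience for the demo) and runs the query unchanged.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Products (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE Sections (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE Products_sections (product_id INTEGER, section_id INTEGER);
    INSERT INTO Products VALUES (1, 'something'), (2, 'something_else'), (3, 'other_thing');
    INSERT INTO Sections VALUES (1, 'new'), (2, 'car');
    INSERT INTO Products_sections VALUES (1, 1), (1, 2), (2, 1), (3, 2);
""")

# "There is no required section that fails to contain the product."
names = [row[0] for row in conn.execute("""
    SELECT Products.name
    FROM Products
    WHERE NOT EXISTS (
        SELECT id
        FROM Sections
        WHERE name IN ('new', 'car')          -- the required sections
          AND NOT EXISTS (                    -- ...with no link row for this product
              SELECT *
              FROM Products_sections
              WHERE Products_sections.section_id = Sections.id
                AND Products_sections.product_id = Products.id
          )
    )
""")]

print(names)
```

Product 2 is rejected because 'car' is a required section with no link row for it, and product 3 is rejected because 'new' is.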
The way I always do these is this:
Start at what you're trying to get (products), and then go through your lookup table (products_sections) to what you're trying to filter by (sections). This way, you can have it in plain view what you're looking for, and you never have to memorize surrogate keys (which are a great thing to have, not to memorize).
select distinct
p.name
from
products p
inner join products_sections ps1 on
p.id = ps1.product_id
inner join sections s1 on
ps1.section_id = s1.id
inner join products_sections ps2 on
p.id = ps2.product_id
inner join sections s2 on
ps2.section_id = s2.id
where
s1.name = 'new'
and s2.name = 'car'
Voila. Four inner joins (the lookup table is joined once per section, because a single products_sections row can only point at one section), and you have a clear, concise query that makes it obvious what it's bringing back. Hope this helps!
SELECT product_id, count(*) AS TotalSection
FROM Products_sections
WHERE section_id IN (1,2)
GROUP BY product_id
HAVING TotalSection = 2;
See if this works in mysql.
The query below is a little unwieldy, but it should answer your question:
select products.id
from products
where products.id in
(
select products_sections.product_id
from products_sections
where products_sections.section_id=1
)
and products.id in
(
select products_sections.product_id
from products_sections
where products_sections.section_id=2
)
Self-join two subsets of the join table, then select the unique product ids:
SELECT DISTINCT car.product_id
FROM ( SELECT product_id
FROM Products_sections
WHERE section_id = 2
) car JOIN
( SELECT product_id
FROM Products_sections
WHERE section_id = 1
) neww
ON (car.product_id = neww.product_id)
This query is a variation of a more general solution:
SELECT DISTINCT car.product_id
FROM Products_sections car JOIN
Products_sections neww ON (car.product_id = neww.product_id AND
car.section_id = 2 AND
neww.section_id = 1)
A less efficient but more straightforward solution is:
SELECT p.name FROM Products p WHERE
EXISTS (SELECT 'found car'
FROM Products_sections ps
WHERE ps.product_id = p.id AND ps.section_id = 2)
AND
EXISTS (SELECT 'found new'
FROM products_sections ps
WHERE ps.product_id = p.id AND ps.section_id = 1)
----------------
I used the section ids directly for clarity. If necessary, replace the expressions section_id = 2 and section_id = 1 with
section_id = (SELECT s.id FROM Sections s WHERE s.name = 'car')
section_id = (SELECT s.id FROM Sections s WHERE s.name = 'new')
Also, you can select product names by plugging in any of the queries above like this:
SELECT Products.name FROM Products
WHERE EXISTS (
SELECT 'found product'
FROM Products_sections car JOIN
Products_sections neww ON (car.product_id = neww.product_id AND
car.section_id = 2 AND
neww.section_id = 1)
WHERE car.product_id = Products.id
)
SELECT p.*
FROM Products p
INNER JOIN (SELECT ps.product_id
FROM Products_sections ps
INNER JOIN Sections s
ON s.id = ps.section_id
WHERE s.name IN ('new','car')
GROUP BY ps.product_id
HAVING Count(ps.product_id) = 2) pp
ON p.id = pp.product_id
This query gets you the result without having to add more inner joins when you need to search more sections. Only two things change:
the values inside the IN () parentheses
the value compared against the count in the HAVING clause, which should equal the number of sections you are searching for
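Since only the IN list and the count change, the query is easy to generate. Below is a rough sketch using Python's sqlite3 module against an in-memory database loaded with the question's sample rows. The helper name products_in_all_sections is made up for this example, and COUNT(DISTINCT s.id) is used instead of a plain count so that a duplicated link row cannot inflate the total.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Products (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE Sections (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE Products_sections (product_id INTEGER, section_id INTEGER);
    INSERT INTO Products VALUES (1, 'something'), (2, 'something_else'), (3, 'other_thing');
    INSERT INTO Sections VALUES (1, 'new'), (2, 'car');
    INSERT INTO Products_sections VALUES (1, 1), (1, 2), (2, 1), (3, 2);
""")

def products_in_all_sections(conn, section_names):
    """Return ids of products that belong to every named section.

    Hypothetical helper: the IN list and the expected count are both
    derived from section_names, so nothing else has to change when
    more sections are searched.
    """
    names = sorted(set(section_names))        # ignore repeated names
    placeholders = ", ".join("?" for _ in names)
    sql = f"""
        SELECT p.id
        FROM Products p
        JOIN Products_sections ps ON ps.product_id = p.id
        JOIN Sections s           ON s.id = ps.section_id
        WHERE s.name IN ({placeholders})
        GROUP BY p.id
        HAVING COUNT(DISTINCT s.id) = ?
        ORDER BY p.id
    """
    return [row[0] for row in conn.execute(sql, (*names, len(names)))]

both = products_in_all_sections(conn, ["new", "car"])
only_new = products_in_all_sections(conn, ["new"])
only_car = products_in_all_sections(conn, ["car"])
print(both, only_new, only_car)
```

The section names are passed as bound parameters; only the number of placeholders is interpolated into the SQL text.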
SELECT id, name FROM
(
SELECT
products.id,
products.name,
COUNT(*) AS count FROM products
INNER JOIN products_sections
ON products_sections.product_id=products.id
INNER JOIN sections
ON sections.id=products_sections.section_id
WHERE sections.name IN ('car', 'new')
GROUP BY products.id
) AS P
WHERE count = 2
select
`p`.`id`,
`p`.`name`
from `Sections` as `s`
join `Products_sections` as `ps` on `ps`.`section_id` = `s`.`id`
join `Products` as `p` on `p`.`id` = `ps`.`product_id`
where `s`.`id` in ( 1,2 )
group by `p`.`id`, `p`.`name`
having count( distinct `s`.`name` ) = 2
will return...
id  name
-----------------
1   something
Is that what you were looking for?

Join two tables where all child records of first table match all child records of second table

I have four tables: Customer, CustomerCategory, Limit, and LimitCategory. A customer can be in multiple categories and a limit can also have multiple categories. I need to write a query that will return the customer name and limit amount where ALL of the customer's categories match ALL of the limit's categories.
I'm guessing it would be similar to the answer here, but I can't seem to get it right. Thanks!
Edit - Here's what the tables look like:
tblCustomer
customerId
name
tblCustomerCategory
customerId
categoryId
tblLimit
limitId
limit
tblLimitCategory
limitId
categoryId
I THINK you're looking for:
SELECT *
FROM tblCustomerCategory
LEFT OUTER JOIN tblCustomer
ON tblCustomerCategory.customerId = tblCustomer.customerId
INNER JOIN tblLimitCategory
ON tblCustomerCategory.categoryId = tblLimitCategory.categoryId
LEFT OUTER JOIN tblLimit
ON tblLimit.limitId = tblLimitCategory.limitId
Updated!
Thanks to Felix for pointing out a flaw in my existing solution (3 years after I originally posted it, hehe). After looking at it again, I think this might be correct. Here I'm getting (1) the customers and limits with matching categories, plus the number of matching categories, (2) the number of categories per customer, (3) the number of categories per limit, (4) I then ensure the number of categories for customer and limits is the same as the number of the matches between the customers and limits:
UNTESTED!
select distinct
matches.name,
matches.limit
from (
select
c.name,
c.customerId,
l.limit,
l.limitId,
count(*) over(partition by cc.customerId, lc.limitId) as matchCount
from tblCustomer c
join tblCustomerCategory cc on c.customerId = cc.customerId
join tblLimitCategory lc on cc.categoryId = lc.categoryId
join tblLimit l on lc.limitId = l.limitId
) as matches
join (
select
cc.customerId,
count(*) as categoryCount
from tblCustomerCategory cc
group by cc.customerId
) as customerCategories
on matches.customerId = customerCategories.customerId
join (
select
lc.limitId,
count(*) as categoryCount
from tblLimitCategory lc
group by lc.limitId
) as limitCategories
on matches.limitId = limitCategories.limitId
where matches.matchCount = customerCategories.categoryCount
and matches.matchCount = limitCategories.categoryCount
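The counting idea can be tried on a small data set. The sketch below is a variation, not the query above: it uses GROUP BY with correlated counts in HAVING instead of window functions, runs on an in-memory SQLite database, and renames the limit column to amount because LIMIT is a keyword there. The sample customers and limits are invented, and each (id, categoryId) pair is assumed to be unique.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tblCustomer (customerId INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE tblCustomerCategory (customerId INTEGER, categoryId INTEGER);
    -- "amount" stands in for the question's "limit" column (assumption)
    CREATE TABLE tblLimit (limitId INTEGER PRIMARY KEY, amount INTEGER);
    CREATE TABLE tblLimitCategory (limitId INTEGER, categoryId INTEGER);

    INSERT INTO tblCustomer VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO tblCustomerCategory VALUES (1, 1), (1, 2), (2, 1);
    INSERT INTO tblLimit VALUES (1, 100), (2, 50), (3, 75);
    INSERT INTO tblLimitCategory VALUES
        (1, 1), (1, 2),          -- same categories as Ann
        (2, 1),                  -- same categories as Bob
        (3, 1), (3, 2), (3, 3);  -- a superset of both
""")

# For each (customer, limit) pair, COUNT(*) is the number of shared
# categories. The two sets are identical exactly when that number equals
# both the customer's category count and the limit's category count.
pairs = conn.execute("""
    SELECT c.name, l.amount
    FROM tblCustomer c
    JOIN tblCustomerCategory cc ON cc.customerId = c.customerId
    JOIN tblLimitCategory lc    ON lc.categoryId = cc.categoryId
    JOIN tblLimit l             ON l.limitId = lc.limitId
    GROUP BY c.customerId, l.limitId
    HAVING COUNT(*) = (SELECT COUNT(*) FROM tblCustomerCategory x
                       WHERE x.customerId = c.customerId)
       AND COUNT(*) = (SELECT COUNT(*) FROM tblLimitCategory y
                       WHERE y.limitId = l.limitId)
    ORDER BY c.name
""").fetchall()

print(pairs)
```

Limit 3 matches nobody: it shares every category with Ann and Bob, but it also has a category that neither of them has, so its own count is larger than the number of matches.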
I don't know if this will work or not; it's just a thought I had and I can't test it. I'm sure there's a nicer way, so don't be too harsh :)
SELECT
c.customerId
, l.limitId
FROM
tblCustomer c
CROSS JOIN
tblLimit l
WHERE NOT EXISTS
(
SELECT
lc.categoryId
FROM
tblLimitCategory lc
WHERE
lc.limitId = l.limitId
EXCEPT
SELECT
cc.categoryId
FROM
tblCustomerCategory cc
WHERE
cc.customerId = c.customerId
)