Optimize JOIN SQL query with additional SELECT - sql

I need a query that selects just one image per product (GROUP BY phi.id_product), and that image has to be the one with the highest priority (hence the inner SELECT with ORDER BY).
The priority is stored in the N:M relation table product_has_image.
I've written a query, but it takes about 3 seconds to execute and I need to optimize it. Here it is:
SELECT p.*, i.id AS imageid
FROM `product` p JOIN `category` c on c.`id` = p.`id_category`
LEFT OUTER JOIN (SELECT id_product, id_image FROM
`product_has_image` ORDER BY priority DESC) phi ON p.id = phi.id_product
LEFT OUTER JOIN `image` i ON phi.id_image = i.id
WHERE (c.`id_parent` = 2 OR c.`id` = 2)
GROUP BY phi.id_product
Indexes which I find to be important in this query are:
image (PRIMARY id)
product_has_image (PRIMARY id_product, id_image; INDEX id_product; INDEX id_image)
product (PRIMARY id, id_category; INDEX id_category)
category (PRIMARY id; INDEX id_parent)
Most of the time is spent joining against the inner SELECT (the derived table), which is needed for the sorting.
Joining with LEFT JOIN [product_has_image] phi ON p.id = phi.id_product is much faster, but doesn't assign the image with the highest priority.
Any help would be appreciated.

Reformatted for sensibility . . .
SELECT p.*, i.id AS imageid
FROM `product` p
INNER JOIN `category` c on (c.`id` = p.`id_category`)
LEFT OUTER JOIN (SELECT id_product, id_image
FROM `product_has_image`
ORDER BY priority DESC) phi
ON (p.id = phi.id_product)
LEFT OUTER JOIN `image` i
ON (phi.id_image = i.id)
WHERE (c.`id_parent` = 2 OR c.`id` = 2)
GROUP BY phi.id_product
Without seeing an execution plan or DDL, I'd guess (shudder) that the problem is likely to be the inner select/sort. If you create a view
create view highest_priority_images as
select id_product, max(priority) as max_priority
from product_has_image
group by id_product
Then you can replace that inner SELECT...ORDER BY with a SELECT...INNER JOIN on that view. That would reduce the cardinality, so I'd expect it to run faster.
Posting DDL would help.

I would probably try to do it like this:
SELECT p.*, i.id AS imageid
FROM `product` p
INNER JOIN `category` c ON c.id = p.id_category
/* a list of `id_product`s with their highest priorities
from `product_has_image` */
LEFT OUTER JOIN (
SELECT id_product, MAX(priority) AS max_priority
FROM `product_has_image`
GROUP BY id_product
) m ON p.id = m.id_product
/* now joining `product_has_image` again, using
m.`max_priority` for additional filtering */
LEFT OUTER JOIN `product_has_image` phi
ON p.id = phi.id_product AND m.max_priority = phi.priority
/* if you only select `id` from `image`, you can use
phi.`id_image` instead and remove this join */
LEFT OUTER JOIN `image` i ON phi.id_image = i.id
WHERE c.id_parent = 2 OR c.id = 2
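To see how the MAX(priority) join behaves, here is a minimal sketch run against an in-memory SQLite database from Python. The table and column names follow the question, but the rows are invented, the category join is left out for brevity, and SQLite stands in for MySQL, so treat it as an illustration rather than a benchmark.

```python
import sqlite3

# Toy schema: names follow the question, the data is made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE product (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE product_has_image (
    id_product INTEGER, id_image INTEGER, priority INTEGER,
    PRIMARY KEY (id_product, id_image)
);
INSERT INTO product VALUES (1, 'chair'), (2, 'table'), (3, 'lamp');
INSERT INTO product_has_image VALUES
    (1, 10, 1), (1, 11, 5), (1, 12, 3),
    (2, 20, 2), (2, 21, 2);   -- product 2 has a tie on priority
""")

rows = conn.execute("""
SELECT p.id, phi.id_image
FROM product p
LEFT JOIN (
    -- one row per product with its highest priority
    SELECT id_product, MAX(priority) AS max_priority
    FROM product_has_image
    GROUP BY id_product
) m ON p.id = m.id_product
LEFT JOIN product_has_image phi
    ON phi.id_product = p.id AND phi.priority = m.max_priority
ORDER BY p.id, phi.id_image
""").fetchall()

print(rows)
```

Product 1 gets image 11 and product 3 (no images) gets NULL, but product 2 comes back twice because two images share the top priority. If priorities are not unique per product, an extra tie-breaker (or a final GROUP BY) is still needed.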

Can't test it now, but wouldn't it be possible to do this?
SELECT p.*, i.id AS imageid
FROM `product` p JOIN `category` c on c.`id` = p.`id_category`
LEFT JOIN `product_has_image` phi ON p.id = phi.id_product
LEFT OUTER JOIN `image` i ON phi.id_image = i.id
WHERE (c.`id_parent` = 2 OR c.`id` = 2)
GROUP BY phi.id_product
ORDER BY phi.priority DESC
Do it in a regular join and order by phi.priority.

Related

How to add TOP 1 in query with left join in views?

For ID = 42 I get the same product three times, once for each of its 3 images. I want only the first image for each product ID. I tried adding "TOP 1", but it gives an error.
This is my query:
CREATE OR REPLACE VIEW UserOrdersView
AS
SELECT
u.[User_ID],
p.Product_Name,
p.Price,
o.Order_Price,
o.Order_ID,
i.[Image]
FROM Product p
LEFT JOIN Orders o ON o.Product_ID = p.Product_ID
INNER JOIN Users u ON u.[User_ID]= o.[User_ID]
LEFT JOIN Product_Images i ON i.Product_ID = p.Product_ID
WHERE o.[User_ID] = 42
You need to use OUTER APPLY to get the top 1 image from the Product_Images table for each Product ID.
For more background, see the Stack Overflow question "Real life example, when to use OUTER / CROSS APPLY in SQL".
Here is the updated view code:
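-- SQL Server has no CREATE OR REPLACE; CREATE OR ALTER needs SQL Server 2016 SP1 or later.
CREATE OR ALTER VIEW UserOrdersView
AS
-- A view body is a single SELECT, so there is no BEGIN ... END wrapper.
SELECT
u.[User_ID],
p.Product_Name,
p.Price,
o.Order_Price,
o.Order_ID,
i.[Image]
FROM Product p
-- Orders must be joined before Users, because the Users join refers to o.
LEFT JOIN Orders o ON o.Product_ID = p.Product_ID
INNER JOIN Users u ON u.[User_ID]= o.[User_ID]
OUTER APPLY
(
-- Without an ORDER BY, TOP 1 returns an arbitrary image for the product.
SELECT TOP 1
T2.[Image]
FROM Product_Images T2
WHERE T2.Product_ID = p.Product_ID
) i
WHERE o.[User_ID] = 42
GO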
WITH cte as (
SELECT
u.[User_ID],
p.Product_Name,
p.Price,
o.Order_Price,
o.Order_ID,
i.[Image],
ROW_NUMBER() OVER (PARTITION BY p.Product_Name ORDER BY p.Product_Name) AS rn
FROM Product p
LEFT JOIN Orders o ON o.Product_ID = p.Product_ID
INNER JOIN Users u ON u.[User_ID]= o.[User_ID]
LEFT JOIN Product_Images i ON i.Product_ID = p.Product_ID
)
SELECT [User_ID],Product_Name,Price,Order_Price,Order_ID,[Image] FROM cte
WHERE rn=1
Put your whole query inside a CTE with a new column that you then use to filter the results.
This new column is produced by the ROW_NUMBER() function, partitioned by Product_Name.
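The ROW_NUMBER() filter can be tried on toy data. The sketch below uses SQLite from Python rather than SQL Server (window functions need SQLite 3.25 or later). The lower-case table names, the image_id column used for ordering, and the rows are all assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE product (product_id INTEGER PRIMARY KEY, product_name TEXT);
-- image_id is a hypothetical surrogate key used to define "first"
CREATE TABLE product_images (
    image_id INTEGER PRIMARY KEY, product_id INTEGER, image TEXT
);
INSERT INTO product VALUES (1, 'mug'), (2, 'pen');
INSERT INTO product_images VALUES
    (1, 1, 'mug-front.png'), (2, 1, 'mug-side.png'), (3, 1, 'mug-top.png'),
    (4, 2, 'pen.png');
""")

rows = conn.execute("""
WITH cte AS (
    SELECT p.product_name,
           i.image,
           -- number the images within each product, lowest image_id first
           ROW_NUMBER() OVER (
               PARTITION BY p.product_id ORDER BY i.image_id
           ) AS rn
    FROM product p
    LEFT JOIN product_images i ON i.product_id = p.product_id
)
SELECT product_name, image
FROM cte
WHERE rn = 1
ORDER BY product_name
""").fetchall()

print(rows)
```

Partitioning by the product is what limits the result to one row per product. Partitioning by the image itself would give every distinct image rn = 1 and filter nothing.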

Re-writing query from in() to joins

Can you assist in re-writing this into joins?
select * from users where users.advised_by in (
select p.id
from advisors p
join advisor_members m on p.id = m.advisor_id
join representatives r on m.user_id=r.user_id
where m.memeber_type='Advisor'
)
This is part of a 200+ line query, and that in() subquery is hard to maintain when there are changes.
You should join against the subquery with a proper ON clause:
select *
from users
inner join
(
select p.id
from advisors p
join advisor_members m on p.id = m.advisor_id
join representatives r on m.user_id=r.user_id
where m.memeber_type='Advisor'
) t on users.advised_by = t.id
/*Option 1 */
SELECT *
FROM users usr
INNER JOIN
(
SELECT p.id AS advisor_id
FROM advisors p
JOIN advisor_members m
ON p.id = m.advisor_id
JOIN representatives r
ON m.user_id=r.user_id
WHERE m.memeber_type='Advisor' ) T2
ON usr.advised_by = T2.advisor_id
/*Option2 -- */
SELECT *
FROM users usr
INNER JOIN advisors p
ON usr.advised_by=p.id
JOIN
(
SELECT *
FROM advisor_members
WHERE memeber_type='Advisor') m
ON p.id = m.advisor_id
JOIN representatives r
ON m.user_id=r.user_id
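One thing worth checking before swapping in() for a join: in() never duplicates rows from users, while a join returns one row per match. The sketch below shows the difference on invented data, using SQLite from Python. The table and column names (including memeber_type) come from the question; the column lists and rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (id INTEGER PRIMARY KEY, advised_by INTEGER);
CREATE TABLE advisors (id INTEGER PRIMARY KEY);
CREATE TABLE advisor_members (advisor_id INTEGER, user_id INTEGER, memeber_type TEXT);
CREATE TABLE representatives (user_id INTEGER);
INSERT INTO users VALUES (1, 100), (2, 200);
INSERT INTO advisors VALUES (100), (200);
-- advisor 100 has two 'Advisor' members, advisor 200 has one
INSERT INTO advisor_members VALUES
    (100, 7, 'Advisor'), (100, 8, 'Advisor'), (200, 9, 'Advisor');
INSERT INTO representatives VALUES (7), (8), (9);
""")

# The subquery from the question; {distinct} is filled with '' or 'DISTINCT'.
ADVISOR_IDS = """
    SELECT {distinct} p.id
    FROM advisors p
    JOIN advisor_members m ON p.id = m.advisor_id
    JOIN representatives r ON m.user_id = r.user_id
    WHERE m.memeber_type = 'Advisor'
"""

def user_ids(sql):
    """Run a query and return its first column as a list."""
    return [row[0] for row in conn.execute(sql)]

in_ids = user_ids(
    "SELECT id FROM users WHERE advised_by IN ("
    + ADVISOR_IDS.format(distinct="") + ") ORDER BY id"
)
join_ids = user_ids(
    "SELECT u.id FROM users u JOIN ("
    + ADVISOR_IDS.format(distinct="") + ") t ON u.advised_by = t.id ORDER BY u.id"
)
distinct_join_ids = user_ids(
    "SELECT u.id FROM users u JOIN ("
    + ADVISOR_IDS.format(distinct="DISTINCT") + ") t ON u.advised_by = t.id ORDER BY u.id"
)

print(in_ids, join_ids, distinct_join_ids)
```

Here the plain join returns user 1 twice, because advisor 100 appears twice in the subquery. Adding DISTINCT to the derived table brings the join back in line with the in() version.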

Join/Subquery is fast/slow depending on which column I filter on (not a simple index issue)

PostgreSQL 9.3.2, compiled by Visual C++ build 1600, 64-bit
Each customer can have many orders and referrals. Now, I want to create a view with some statistics for customer where, for each customer, I have some calculated columns (one row for each customer).
Create the view:
create view myview as
select
a.customer_id,
sum(a.num) as num_orders,
sum(b.num) as num_referrals
from
(
select
customer.id as customer_id,
count(customer.id) as num
from
customer
left join
"order"
on
"order".customer_id = customer.id
group by
customer.id
) a
left join
(
select
customer.id as customer_id,
count(customer.id) as num
from
customer
left join
referral
on
referral.customer_id = customer.id
group by
customer.id
) b
on
a.customer_id = b.customer_id
group by
a.customer_id,
b.customer_id
;
Query A (this is fast):
select
customer.*,
myview.*
from
customer
left join
myview
on
customer.id = myview.customer_id
where
customer.id = 100
;
Query B (this is SLOW):
select
customer.*,
myview.*
from
customer
left join
myview
on
customer.id = myview.customer_id
where
customer.sex = 'M'
;
Query C (this is fast):
select
customer.*,
myview.*
from
customer
left join
myview
on
customer.id = myview.customer_id
where
customer.id in (select id from customer where sex = 'M')
;
OK, so why is Query B so much different than Query A in terms of performance? I guess, in Query B, it is running those subqueries first without filtering, but I don't know how to fix it.
The problem is that it is our ORM that is generating the query. So, I can't fix the problem by doing something like Query C.
I'm hoping there's just a better way to design my view to fix the problem. The main difference in the EXPLAIN results between Query A and Query B is that Query B has some MERGE RIGHT JOIN operations.
Any ideas?
EDIT:
I've added the following information at the request of commenters. It is the more true-to-life version (as opposed to the simplified, hypothetical scenario above).
create or replace view myview as
select
a.id_worder,
count(a.*) as num_finance_allocations,
count(b.*) as num_task_allocations
from
(
select
woi.id_worder,
count(*) as num
from
worder_invoice woi
left join
worder_finance_task ct
on
ct.id_worder_finance = woi.id
left join
worder_finance_task_allocation cta
on
cta.id_worder_finance_task = ct.id
group by
woi.id_worder
) a
left join
(
select
wot.id_worder,
count(*) as num
from
worder_task wot
left join
worder_task_allocation wota
on
wota.id_worder_task = wot.id
group by
wot.id_worder
) b
on
a.id_worder = b.id_worder
group by
a.id_worder,
b.id_worder
;
Query A (fast, apparently I need a rep of more than 10 to post more than 2 links, so no EXPLAIN for this one)
select
*
from
worder a
left outer join
myview b
on
a.id = b.id_worder
where
a.id = 100
;
Query B (SLOW, EXPLAIN)
select
*
from
worder a
left outer join
myview b
on
a.id = b.id_worder
where
a.id_customer = 200
Query C (fast, EXPLAIN)
select
*
from
worder a
left outer join
myview b
on
a.id = b.id_worder
where
a.id = (select id from worder where id_customer = 200)
;
Try rewriting your view like so:
create view myview as
select
c.id as customer_id,
(
select count(*) from "order" o where o.customer_id = c.id
) as num_orders,
(
select count(*) from referral r where r.customer_id = c.id
) as num_referrals
from customer c;

Joining three tables with aggregation

I have the following items table:
items:
id pr1 pr2 pr3
-------------------
1 11 22 tt
...
and two tables associated with the items:
comments:
item_id text
-------------
1 "cool"
1 "very good"
...
tags:
item_id tag
-------------
1 "life"
1 "drug"
...
Now I want to get a table with columns item_id, pr1, pr2, count(comments), count(tags) with a condition WHERE pr3 = zz. What is the best way to get it? I can do this by creating additional tables, but I was wondering if there is a way to achieve this using only a single SQL statement. I'm using Postgres 9.3.
The easiest way is certainly to get the counts in the select clause:
select
id,
pr1,
pr2,
(select count(*) from comments where item_id = items.id) as comment_count,
(select count(*) from tags where item_id = items.id) as tag_count
from items;
You can just join, but you need to be careful not to double count. For example, you can use subqueries to get what you want.
SELECT i.id,i.pr1,i.pr2, commentcount,tagcount FROM
items i
INNER JOIN
(SELECT item_id,count(*) as commentcount from comments GROUP BY item_id) c
ON i.id = c.item_id
INNER JOIN
(SELECT item_id,count(*) as tagcount from tags GROUP BY item_id) t
ON i.id = t.item_id
[EDIT] based on the comment, here's the left join version...
SELECT i.id,i.pr1,i.pr2, coalesce(commentcount,0) as commentcount,
coalesce(tagcount,0) as tagcount FROM
items i
LEFT JOIN
(SELECT item_id,count(*) as commentcount from comments GROUP BY item_id) c
ON i.id = c.item_id
LEFT JOIN
(SELECT item_id,count(*) as tagcount from tags GROUP BY item_id) t
ON i.id = t.item_id
Try this:
SELECT i.id, i.pr1, i.pr2, A.commentCount, B.tagCount
FROM items i
LEFT OUTER JOIN (SELECT item_id, COUNT(1) AS commentCount
FROM comments
GROUP BY item_id
) AS A ON i.id = A.item_id
LEFT OUTER JOIN (SELECT item_id, count(1) as tagCount
FROM tags
GROUP BY item_id
) AS B ON i.id = B.item_id;
select
i.id
, i.pr1
, i.pr2
, count(c.item_id) as count_comments
, count(t.item_id) as count_tags
from items i
left outer join comments c on i.id = c.item_id
left outer join tags t on i.id = t.item_id
group by i.id, i.pr1, i.pr2
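I've used a LEFT OUTER JOIN so that items with no comments or tags are still returned, with counts of zero. Be aware that joining both child tables in one pass pairs every comment with every tag, so for items that have rows in both tables each count is multiplied by the number of matching rows in the other table.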
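The double-counting risk mentioned in the answers above is easy to reproduce. The sketch below uses SQLite from Python instead of Postgres, with the tables from the question and invented rows: item 1 has 2 comments and 3 tags, item 2 has neither.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE items (id INTEGER PRIMARY KEY, pr1 INTEGER, pr2 INTEGER, pr3 TEXT);
CREATE TABLE comments (item_id INTEGER, text TEXT);
CREATE TABLE tags (item_id INTEGER, tag TEXT);
INSERT INTO items VALUES (1, 11, 22, 'tt'), (2, 33, 44, 'tt');
INSERT INTO comments VALUES (1, 'cool'), (1, 'very good');
INSERT INTO tags VALUES (1, 'life'), (1, 'drug'), (1, 'misc');
""")

# Joining both child tables directly pairs every comment with every tag.
naive = conn.execute("""
SELECT i.id, COUNT(c.item_id), COUNT(t.item_id)
FROM items i
LEFT JOIN comments c ON i.id = c.item_id
LEFT JOIN tags t ON i.id = t.item_id
GROUP BY i.id
ORDER BY i.id
""").fetchall()

# Aggregating each child table first keeps one row per item on each side.
pre_aggregated = conn.execute("""
SELECT i.id,
       COALESCE(c.commentcount, 0),
       COALESCE(t.tagcount, 0)
FROM items i
LEFT JOIN (SELECT item_id, COUNT(*) AS commentcount
           FROM comments GROUP BY item_id) c ON i.id = c.item_id
LEFT JOIN (SELECT item_id, COUNT(*) AS tagcount
           FROM tags GROUP BY item_id) t ON i.id = t.item_id
ORDER BY i.id
""").fetchall()

print(naive)
print(pre_aggregated)
```

The direct join reports 6 comments and 6 tags for item 1 (2 x 3 joined rows), while the pre-aggregated version reports 2 and 3. Both versions agree on item 2, which has no child rows.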

Using sum with a nested select

I'm using SQL Server. This statement lists my products per menu:
SELECT menuname, productname
FROM [web].[dbo].[tblMenus]
FULL OUTER JOIN [web].[dbo].[tblProductsRelMenus]
ON [tblMenus].Id = [tblProductsRelMenus].MenuId
FULL OUTER JOIN [web].[dbo].[tblProducts]
ON [tblProductsRelMenus].ProductId = [tblProducts].ProductId
LEFT JOIN [web].[dbo].[tblOrderDetails]
ON ([tblProducts].Id = [tblOrderDetails].ProductId)
GROUP BY [tblProducts].ProductName
Some products don't have menus and vice versa. I use the following to establish what has been sold of each product.
SELECT [tblProducts].ProductName, SUM([tblOrderDetails].Ammount) as amount
FROM [web].[dbo].[tblProducts]
LEFT JOIN [web].[dbo].[tblOrderDetails]
ON ([tblProducts].ProductId = [tblOrderDetails].ProductId)
GROUP BY [tblProducts].ProductName
What I want to do is complement the top table with an amount column. That is, I want a table with the same number of rows as in the first table above but with an amount value if it exists, otherwise null.
I can't figure out how to do this. Any suggestions?
If I am not missing anything, the second query could be simplified, then incorporated into the first query like this:
SELECT
m.menuname,
p.productname,
t.amount
FROM [web].[dbo].[tblMenus] m
FULL JOIN [web].[dbo].[tblProductsRelMenus] pm ON m.Id = pm.MenuId
FULL JOIN [web].[dbo].[tblProducts] p ON pm.ProductId = p.ProductId
LEFT JOIN (
SELECT ProductId, SUM(Ammount) as amount
FROM [web].[dbo].[tblOrderDetails]
GROUP BY ProductId
) t ON p.ProductId = t.ProductId
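As a quick check of the "aggregate first, then LEFT JOIN" idea, here is a small sketch using SQLite from Python. It keeps only the two tables involved in the sum and leaves out the menu joins. Table and column names (including Ammount) follow the question; the rows are invented, and SQLite stands in for SQL Server.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tblProducts (ProductId INTEGER PRIMARY KEY, ProductName TEXT);
CREATE TABLE tblOrderDetails (ProductId INTEGER, Ammount INTEGER);
INSERT INTO tblProducts VALUES (1, 'soup'), (2, 'salad');
INSERT INTO tblOrderDetails VALUES (1, 2), (1, 3);   -- 'salad' was never ordered
""")

rows = conn.execute("""
SELECT p.ProductName, t.amount
FROM tblProducts p
LEFT JOIN (
    -- at most one row per product, so the outer row count cannot grow
    SELECT ProductId, SUM(Ammount) AS amount
    FROM tblOrderDetails
    GROUP BY ProductId
) t ON p.ProductId = t.ProductId
ORDER BY p.ProductId
""").fetchall()

print(rows)
```

Because the derived table has at most one row per ProductId, the join adds a column without adding rows, and products with no order details come back with a NULL amount.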