How can I optimize this query and replace the MAX function? - sql

SELECT MAX(f_orderInteractionID)
FROM tOrders O
INNER JOIN tOrderInteractions OI ON OI.fk_orderID = O.f_orderID
GROUP BY OI.fk_orderID
Is there a way to replace the MAX function? On the actual execution plan it uses an index scan, and I would prefer an index seek. How can I improve this query?

To do this, try:
SELECT f_orderInteractionID
FROM tOrders O
INNER JOIN tOrderInteractions OI ON OI.fk_orderID = O.f_orderID
ORDER BY f_orderInteractionID DESC
LIMIT 1
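Two hedged notes on that rewrite. First, the index seek/scan and "actual execution plan" wording suggests SQL Server, which does not support LIMIT; use SELECT TOP (1) ... ORDER BY f_orderInteractionID DESC instead. Second, the original query groups by OI.fk_orderID, i.e. it returns the latest interaction per order rather than a single value. A sketch of a per-order version that can use an index seek, assuming you can add a supporting index on tOrderInteractions (the index name below is made up):

-- Hypothetical supporting index: orders interactions by order, latest id last.
CREATE INDEX IX_tOrderInteractions_order_interaction
ON tOrderInteractions (fk_orderID, f_orderInteractionID);

-- Latest interaction per order, one index seek per outer row (SQL Server syntax).
SELECT O.f_orderID, X.f_orderInteractionID
FROM tOrders AS O
CROSS APPLY (
    SELECT TOP (1) OI.f_orderInteractionID
    FROM tOrderInteractions AS OI
    WHERE OI.fk_orderID = O.f_orderID
    ORDER BY OI.f_orderInteractionID DESC
) AS X;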

Related

JOIN query optimization using indexes

The question is: How to increase the speed of this query?
SELECT
c.CategoryName,
sc.SubcategoryName,
pm.ProductModel,
COUNT(p.ProductID) AS ModelCount
FROM Marketing.ProductModel pm
JOIN Marketing.Product p
ON p.ProductModelID = pm.ProductModelID
JOIN Marketing.Subcategory sc
ON sc.SubcategoryID = p.SubcategoryID
JOIN Marketing.Category c
ON c.CategoryID = sc.CategoryID
GROUP BY c.CategoryName,
sc.SubcategoryName,
pm.ProductModel
HAVING COUNT(p.ProductID) > 1
Schema: (schema diagram not shown)
I tried creating some indexes and reordering the JOINs. This did not improve performance at all. Maybe I need different indexes or a different query?
My solution:
CREATE INDEX idx_Marketing_Subcategory_IDandName ON Marketing.Subcategory (CategoryID)
CREATE INDEX idx_Marketing_Product_PMID ON Marketing.Product (ProductModelID)
CREATE INDEX idx_Marketing_Product_SCID ON Marketing.Product (SubcategoryID)
SELECT
c.CategoryName,
sc.SubcategoryName,
pm.ProductModel,
COUNT(p.ProductID) AS ModelCount
FROM Marketing.Category AS c
JOIN Marketing.Subcategory AS SC
ON c.CategoryID = SC.CategoryID
JOIN Marketing.Product AS P
ON SC.SubcategoryID = p.SubcategoryID
JOIN Marketing.ProductModel AS PM
ON P.ProductModelID = PM.ProductModelID
GROUP BY c.CategoryName,
sc.SubcategoryName,
pm.ProductModel
HAVING COUNT(p.ProductID) > 1
UPD:
Results: execution plan with my indexes (plan screenshot not shown).
Your query has a cost of 0.12, which is trivial, as is the number of rows; it executes in microseconds, and the row estimates are reasonably close, so it is not clear what problem you are trying to solve.
Looking at the execution plan, there is a key lookup for ProductModelId with an estimated cost of 44% of the query. You could eliminate it with a covering index by including that column in Product.idx_Marketing_Product_SCID:
CREATE INDEX idx_Marketing_Product_SCID ON Marketing.Product (SubcategoryID)
INCLUDE (ProductModelId) WITH (DROP_EXISTING = ON)
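WITH (DROP_EXISTING = ON) drops and rebuilds the index of the same name in a single statement, so you do not need a separate DROP INDEX before recreating it with the INCLUDE column.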

Improve performance of query with multiple joins

I have a query with four joins that is taking a considerable amount of time to execute. Is there a way to optimize the query? I tried using the smaller PORTFOLIO table as the base of the joins to try to speed things up.
SELECT
A.*
, B.REPORTING_PERIOD
, D.HPI AS CURRENT_HPI
, E.USSWAP10
, B.DLQ_STATUS AS CURRENT_STATUS
, C.DLQ_STATUS AS NEXT_STATUS
FROM PORTFOLIO A
JOIN ALL_PERFORMANCE B ON
A.AGENCY = B.AGENCY
AND A.LOAN_ID = B.LOAN_ID
JOIN ALL_PERFORMANCE C ON
A.AGENCY = C.AGENCY
AND A.LOAN_ID = C.LOAN_ID
AND DATEADD(MONTH, 1, B.REPORTING_PERIOD) = C.REPORTING_PERIOD
LEFT JOIN CASE_SHILLER D ON
A.GEO_CODE = D.GEO_CODE
AND B.REPORTING_PERIOD = D.AS_OF_DATE
LEFT JOIN SWAP_10Y E ON
B.REPORTING_PERIOD = E.AS_OF_DATE
You can index the columns you join on.
Make sure you have indexes for the combinations of join columns you are using to increase performance. These are the indexes you should have (a DDL sketch follows the list):
Index on (PORTFOLIO.AGENCY, PORTFOLIO.LOAN_ID)
Index on (ALL_PERFORMANCE.AGENCY, ALL_PERFORMANCE.LOAN_ID)
Index on ALL_PERFORMANCE.REPORTING_PERIOD
Index on (ALL_PERFORMANCE.AGENCY, ALL_PERFORMANCE.LOAN_ID, ALL_PERFORMANCE.REPORTING_PERIOD)
Index on (CASE_SHILLER.GEO_CODE, CASE_SHILLER.AS_OF_DATE)
Index on PORTFOLIO.GEO_CODE
Index on SWAP_10Y.AS_OF_DATE
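A DDL sketch of those definitions, assuming SQL Server syntax; the index names below are made up, so adjust them to your conventions:

CREATE INDEX IX_PORTFOLIO_AGENCY_LOAN ON PORTFOLIO (AGENCY, LOAN_ID);
CREATE INDEX IX_PORTFOLIO_GEO_CODE ON PORTFOLIO (GEO_CODE);
-- The three-column index already serves seeks on (AGENCY, LOAN_ID),
-- so a separate two-column index on ALL_PERFORMANCE is redundant.
CREATE INDEX IX_ALL_PERFORMANCE_AGENCY_LOAN_PERIOD ON ALL_PERFORMANCE (AGENCY, LOAN_ID, REPORTING_PERIOD);
CREATE INDEX IX_ALL_PERFORMANCE_REPORTING_PERIOD ON ALL_PERFORMANCE (REPORTING_PERIOD);
CREATE INDEX IX_CASE_SHILLER_GEO_AS_OF ON CASE_SHILLER (GEO_CODE, AS_OF_DATE);
CREATE INDEX IX_SWAP_10Y_AS_OF_DATE ON SWAP_10Y (AS_OF_DATE);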

SQL optimization JOIN before table scan?

I have a SQL query similar to
SELECT columnName
FROM
(SELECT columnName, someColumnWithXml
FROM _Table1
INNER JOIN _Activity ON _Activity.oid = _Table1.columnName
INNER JOIN _ActivityType ON _Activity.activityType = _ActivityType.oid
--_ActivityType.forType is a string
WHERE _ActivityType.forType = '_Disclosure'
AND _Activity.emailRecipients IS NOT NULL) subquery
WHERE subquery.someColumnWithXml LIKE '%'+'9D62EE8855797448A7C689A09D193042'+'%'
There are 15 million rows in _Table1, and the WHERE subquery.someColumnWithXml LIKE '%'+'9D62EE8855797448A7C689A09D193042'+'%' predicate results in an execution plan that performs a full table scan on all 15 million rows. The subquery returns only a few hundred thousand rows, and those are the only rows that really need the LIKE run against them. Is there a way to make this more efficient by running the LIKE only on the results of the subquery, rather than running a table scan with a LIKE on 15,000,000 rows? The someColumnWithXml column is not indexed.
For this query:
SELECT columnName, someColumnWithXml
FROM _Table1 t1
INNER JOIN _Activity a ON a.oid = t1.columnName
INNER JOIN _ActivityType at ON a.activityType = at.oid -- _ActivityType.forType is a string
WHERE at.forType = '_Disclosure'
AND a.emailRecipients IS NOT NULL
AND t1.someColumnWithXml LIKE '%'+'9D62EE8855797448A7C689A09D193042'+'%';
You have a challenge optimizing this query. I don't know whether the filtering conditions are particularly restrictive. If they are, then try indexes on the following (a DDL sketch follows the list):
_ActivityType(forType, oid)
_Activity(activityType, emailRecipients, oid)
_Table1(columnName)
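A DDL sketch of those three, assuming SQL Server (the + string concatenation suggests it); the index names are made up, and if emailRecipients is a wide text type such as nvarchar(max) it cannot be a key column, so move it to an INCLUDE clause instead:

CREATE INDEX IX_ActivityType_forType_oid ON _ActivityType (forType, oid);
CREATE INDEX IX_Activity_type_email_oid ON _Activity (activityType, emailRecipients, oid);
CREATE INDEX IX_Table1_columnName ON _Table1 (columnName);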
If these don't help, then you might try an index on the XML column. Perhaps an XML index would work. Such an index would not really help for a generic LIKE, but that might not be needed if you parse the XML instead.
You could apply the filter in the subquery directly, avoiding the scan of rows that are of no use:
SELECT columnName, someColumnWithXml
FROM _Table1
INNER JOIN _Activity on _Activity.oid = _Table1.columnName
INNER JOIN _ActivityType on _Activity.activityType = _ActivityType.oid
--_ActivityType.forType is a string
WHERE _ActivityType.forType = '_Disclosure'
AND _Activity.emailRecipients IS NOT NULL
AND someColumnWithXml LIKE '%'+'9D62EE8855797448A7C689A09D193042'+'%'

Correct form for a query over MyODBC that takes a long time to execute

I'm trying to build a SQL query for Access that uses tables linked over a MyODBC connection to retrieve the data from the internet, but the query takes too long to finish (about five minutes), so I think the problem is with the query:
SELECT COUNT([o].[orders_id]) AS howmany_orders,
(SELECT SUM([op1].[products_quantity]) FROM orders_total AS ot1, orders AS o1, orders_products AS op1
WHERE [o1].[date_purchased] >=date()-30 and [o1].[orders_id] = [op1].[orders_id] and [ot1].[orders_id] = [op1].[orders_id] and [ot1].[class]="ot_total" and [o1].[orders_status] = 1 and [op1].[products_id]=[op].[products_id]
GROUP BY [op1].[products_id]
) AS pendiente,
[op].[products_model],
Round((((7+1)*(howmany_orders/90))+1)-(p.stock_real- IIF(pendiente>0,pendiente,0)), 0) AS pedir,
p.ref_id
FROM orders_total AS ot, orders AS o, orders_products AS op INNER JOIN Productos AS p ON Mid([op].[products_model],4) LIKE p.ref_id
WHERE [o].[date_purchased] >=date()-90 and [o].[orders_id] = [op].[orders_id] and [ot].[orders_id] = [op].[orders_id] and [ot].[class]="ot_total" and [o].[orders_status] IN (7, 1) and ((p.fuera_de_stock)=False) and ((p.suspendido)=False) and ((p.quitar_de_la_web)=False)
GROUP BY [op].[products_model], p.ref_id, p.stock_real, [op].[products_id];
At a glance I see that the "LIKE" operator could be one of the problems here:
INNER JOIN Productos AS p ON Mid([op].[products_model],4) LIKE p.ref_id
but I have no way to substitute an = operator for it.
Thanks for your help!
EDIT:
I have reduced the query to the one below, but it takes the same time:
SELECT COUNT(o.orders_id) AS howmany_orders, (
SELECT SUM(opz.products_quantity) FROM orders AS oz, orders_products AS opz WHERE oz.date_purchased >=date()-30 and oz.orders_id = opz.orders_id and oz.orders_status = 1 and opz.products_id=op.products_id GROUP BY opz.products_id
) AS pendiente, op.products_model, Round((((7+1)*(howmany_orders/90))+1)-(p.stock_real-IIf(pendiente>0,pendiente,0)),0) AS pedir, p.ref_id
FROM orders AS o, orders_products AS op INNER JOIN Productos AS p ON op.products_model=p.cod
WHERE o.date_purchased>=date()-90 And o.orders_id=op.orders_id And o.orders_status In (7,1) And ((p.suspendido)=False) And ((p.quitar_de_la_web)=False)
GROUP BY op.products_model, p.ref_id, p.stock_real, op.products_id;
Yep, there's your problem. The database cannot do the join using any indexes, so it has to do a table scan. Is there any way you could persist this data so you don't have to evaluate the Mid() expression and can just join on the stored value? i.e. give the table that holds products_model an extra column with the Mid() result stored in it, and join on that column (a sketch follows).
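A sketch of that idea, assuming the linked tables live in MySQL behind the MyODBC connection and you are allowed to change them on the server; the column and index names are made up, and VARCHAR(50) is a guess at the length:

-- On the MySQL side: persist the trimmed model code once, then index it.
ALTER TABLE orders_products ADD COLUMN products_model_ref VARCHAR(50);
UPDATE orders_products SET products_model_ref = SUBSTRING(products_model, 4);
CREATE INDEX idx_orders_products_model_ref ON orders_products (products_model_ref);

The Access query can then join on a plain equality that the index can serve, e.g. INNER JOIN Productos AS p ON op.products_model_ref = p.ref_id. New rows need products_model_ref populated as they are inserted (in the insert code or via a trigger).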

Can a SQL subquery return two or more values but still compare against one of them?

I have this query:
SELECT Items.Name, tblBooks.AuthorLastName, tblBooks.AuthorFirstName
FROM Items WHERE Items.ProductCode IN (
SELECT TOP 10 Recommended.ProductCode
FROM
Recommended
INNER JOIN Stock ON Recommended.ProductCode = Stock.ProductCode
AND Stock.StatusCode = 1
WHERE (Recommended.Type = 'TOPICAL') ORDER BY CHECKSUM(NEWID()));
It is fine for my data, except that the Recommended table has a SKU field I also need; however, I cannot put it next to Recommended.ProductCode and still have the query work.
I have used JOINs for this query and they work, but this query runs faster. I just need the ProductCode and SKU from the Recommended table; how can this be done without needing yet another subquery?
Database: MS SQL Server 2000
The subquery seems to be picking 10 random recommendations. I think you can do that without a subquery:
SELECT TOP 10
Items.*,
Recommended.*,
Stock.*
FROM Items
INNER JOIN Recommended
ON Items.ProductCode = Recommended.ProductCode
AND Recommended.Type = 'TOPICAL'
INNER JOIN Stock
ON Recommended.ProductCode = Stock.ProductCode
AND Stock.StatusCode = 1
ORDER BY CHECKSUM(NEWID())
This gives you access to all columns, without having to pass them up from the subquery.
You can only return one value with the subselect, so you have to obtain the fields from the Recommended table by a join - which I presume is what you have already:
SELECT Items.Name, tblBooks.AuthorLastName, tblBooks.AuthorFirstName, Recommended.SKU
FROM Items
INNER JOIN Recommended ON Recommended.ProductCode = Items.ProductCode
WHERE Items.ProductCode IN (
SELECT TOP 10 Recommended.ProductCode
FROM
Recommended
INNER JOIN Stock ON Recommended.ProductCode = Stock.ProductCode
AND Stock.StatusCode = 1
WHERE (Recommended.Type = 'TOPICAL') ORDER BY CHECKSUM(NEWID()));
Most likely the join in reality is an outer join too, I'd guess. This really shouldn't have any performance issues as long as you have both the Items and Recommended tables indexed on ProductCode.
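A minimal sketch of those two indexes, assuming SQL Server 2000 syntax and that ProductCode is not already the clustered key on either table (index names are made up):

CREATE INDEX IX_Items_ProductCode ON Items (ProductCode);
CREATE INDEX IX_Recommended_ProductCode ON Recommended (ProductCode);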
I think you need to move the subquery out of the where clause:
SELECT Items.Name, tblBooks.AuthorLastName, tblBooks.AuthorFirstName, Rec.SKU
FROM Items
INNER JOIN
(SELECT TOP 10 Recommended.ProductCode, Recommended.SKU FROM Recommended
INNER JOIN Stock ON Recommended.ProductCode = Stock.ProductCode AND
Stock.StatusCode = 1 WHERE (Recommended.Type = 'TOPICAL')
ORDER BY CHECKSUM(NEWID()))
AS Rec ON Items.ProductCode = Rec.ProductCode;
The above is valid syntax in MySQL, your mileage may vary...
Under those circumstances I would normally use an inner join to get both the row filtering I needed from the WHERE clause and the extra columns. Something like the query below; if this is what you did and it gave you a performance hit, then you might need to flip the query: go from Recommended and join to Items, as that will probably filter more data before the join.
SELECT Items.Name, tblBooks.AuthorLastName, tblBooks.AuthorFirstName, reccomended.SKUID
FROM Items
Inner Join
(
SELECT TOP 10 Recommended.ProductCode, SKUID
FROM
Recommended
INNER JOIN Stock ON Recommended.ProductCode = Stock.ProductCode
AND Stock.StatusCode = 1
WHERE (Recommended.Type = 'TOPICAL')
) reccomended
on items.productcode = reccomended.ProductCode
ORDER BY CHECKSUM(NEWID())