I have the following query, which I am executing directly from my code and loading into a DataTable. The problem is that it takes more than 10 minutes to run; the main part taking the time is the NOT EXISTS.
SELECT
[t0].[PayrollEmployeeId],
[t0].[InOutDate],
[t0].[InOutFlag],
[t0].[InOutTime]
FROM [dbo].[MachineLog] AS [t0]
WHERE
([t0].[CompanyId] = 1)
AND ([t0].[InOutDate] >= '2016-12-13')
AND ([t0].[InOutDate] <= '2016-12-14')
AND
( NOT (EXISTS(
SELECT NULL AS [EMPTY]
FROM [dbo].[TO_Entry] AS [t1]
WHERE
([t1].[EmployeeId] = [t0].[PayrollEmployeeId])
AND ([t1].[CompanyId] = 1)
AND (([t0].[InOutDate]) = [t1].[Entry_Date])
AND ([t1].[Entry_Method] = 'M')
))
)
ORDER BY
[t0].[PayrollEmployeeId], [t0].[InOutDate]
Is there any way I can optimize this query, or a workaround for it? It is taking far too long.
It seems that you can convert the NOT EXISTS into a LEFT JOIN, keeping only the rows where the second table returns NULL values.
Please check the following SELECT and modify it if required to fulfill your requirements:
SELECT
[t0].[PayrollEmployeeId], [t0].[InOutDate], [t0].[InOutFlag], [t0].[InOutTime]
FROM [dbo].[MachineLog] AS [t0]
LEFT JOIN [dbo].[TO_Entry] AS [t1]
ON [t1].[EmployeeId] = [t0].[PayrollEmployeeId]
AND [t0].[InOutDate] = [t1].[Entry_Date]
AND [t1].[CompanyId] = 1
AND [t1].[Entry_Method] = 'M'
WHERE
([t0].[CompanyId] = 1)
AND ([t0].[InOutDate] >= '2016-12-13')
AND ([t0].[InOutDate] <= '2016-12-14')
AND [t1].[EmployeeId] IS NULL
ORDER BY
[t0].[PayrollEmployeeId], [t0].[InOutDate]
You will also notice an informative message in the execution plan for your query: it reports a missing index whose estimated impact is about 30% of the execution time.
It seems that your transaction data accumulates along date fields such as Entry_Date.
Date fields, especially in your case, are strong candidates for clustered indexes, so you can create an index on the Entry_Date column.
I guess you already have an index on InOutDate; you can try indexing Entry_Date as well.
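For example, a covering index along these lines could support the NOT EXISTS lookup (a minimal sketch; the index name and column choice are assumptions based only on the query above, so verify against your actual schema and workload):
-- Covers the columns the NOT EXISTS subquery filters and joins on.
CREATE NONCLUSTERED INDEX IX_TO_Entry_Company_EntryDate_Employee
ON [dbo].[TO_Entry] ([CompanyId], [Entry_Date], [EmployeeId], [Entry_Method]);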
I have the following SQL statement, but it takes 20 seconds to run. How can I make it faster?
SELECT TOP (100) PERCENT
dbo.pod.order_no,
dbo.pod.order_line_no,
dbo.poh.currency,
dbo.pod.warehouse,
dbo.pod.product,
dbo.poh.address1,
dbo.pod.description,
dbo.pod.date_required,
dbo.pod.qty_ordered,
dbo.pod.qty_received,
dbo.pod.qty_invoiced,
dbo.pod.status,
dbo.poh.date_entered,
dbo.stock.analysis_c,
dbo.stock.catalogue_number,
dbo.stock.drawing_number,
dbo.poh.date_required AS OriginalRequiredDate,
dbo.stock.standard_cost,
dbo.poh.supplier_ref,
dbo.stock.reorder_days,
dbo.pod.local_expect_cost,
dbo.poh.supplier,
dbo.pod.qty_ordered - dbo.pod.qty_received AS qty_outstanding,
dbo.stock.warehouse AS warehouse2
FROM dbo.stock
RIGHT OUTER JOIN dbo.pod
ON dbo.stock.product = dbo.pod.product
LEFT OUTER JOIN dbo.poh
ON dbo.pod.order_no = dbo.poh.order_no
WHERE (dbo.pod.status <> 'C')
AND (dbo.poh.status <> '9')
AND (dbo.stock.analysis_c IN ('FB', 'FP', 'RM', '[PK]'))
AND (dbo.pod.qty_ordered - dbo.pod.qty_received > 0)
AND (dbo.stock.warehouse = 'FH')
The execution plan says the remote query takes up 89% of the cost; these tables are accessed through a linked server.
I'd move (dbo.stock.warehouse = 'FH') up to be the first item in the WHERE clause, since stock is your main table. I'd then run the query through the profiler to see where the lag is; this might help narrow down the area that needs to change. A lightweight alternative is sketched below.
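If you can run the query in SSMS, a simple way to see per-table cost without the profiler is to enable per-query statistics (standard T-SQL, not specific to your setup):
-- Reports per-table logical/physical reads and CPU/elapsed time in the Messages tab.
SET STATISTICS IO, TIME ON;
-- ... run the query here ...
SET STATISTICS IO, TIME OFF;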
As noted in the comments, there shouldn't be any TOP (100) PERCENT clause (what keeps adding it automatically?).
I'd rewrite your view like this (for readability):
SELECT P.order_no
, P.order_line_no
, T.currency
, P.warehouse
, P.product
, T.address1
, P.[Description]
, P.date_required
, P.qty_ordered
, P.qty_received
, P.qty_invoiced
, P.[Status]
, T.date_entered
, S.analysis_c
, S.catalogue_number
, S.drawing_number
, T.date_required AS OriginalRequiredDate
, S.standard_cost
, T.supplier_ref
, S.reorder_days
, P.local_expect_cost
, T.supplier
, P.qty_ordered - P.qty_received AS qty_outstanding
, S.warehouse AS warehouse2
FROM dbo.stock AS S
RIGHT JOIN dbo.pod AS P
ON S.product = P.product
LEFT JOIN dbo.poh AS T
ON P.order_no = T.order_no
WHERE P.[Status] <> 'C'
AND T.[Status] <> '9'
AND S.analysis_c IN ('FB', 'FP', 'RM', '[PK]')
AND P.qty_ordered - P.qty_received > 0
AND S.warehouse = 'FH';
Also, I'd create the following indexes, which should improve performance (hopefully I didn't miss any columns):
CREATE NONCLUSTERED INDEX idx_Stock_product_warehouse_analysisC_iColumns
ON dbo.Stock (product, warehouse, analysis_c)
INCLUDE (catalogue_number, drawing_number, standard_cost, reorder_days);
CREATE NONCLUSTERED INDEX idx_Pod_product_orderNo_status_qtyOrdered_qtyReceived_iColumns
ON dbo.Pod (product, order_no, [status], qty_ordered, qty_received)
INCLUDE (order_line_no, warehouse, [Description], date_required, qty_invoiced, local_expect_cost);
CREATE NONCLUSTERED INDEX idx_Poh_orderNo_Status_iColumns
ON dbo.Poh (order_no, [Status])
INCLUDE (currency, address1, date_entered, date_required, supplier_ref, supplier);
There isn't really much to work on here, so these are general guesses at what could help. You have five criteria in your SQL that could reduce the number of rows:
pod.status <> 'C'
pod.qty_ordered - pod.qty_received > 0
poh.status <> '9'
stock.analysis_c IN ('FB', 'FP', 'RM', '[PK]')
stock.warehouse = 'FH'
For each of these, the selectivity of the criterion is essential. For example, if 90% of your rows have pod.status = 'C', then you should probably add a filtered index for status <> 'C' (and likewise for the poh.status field), as sketched below.
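A minimal sketch of such a filtered index, assuming most rows carry status = 'C' (the index name is made up):
-- Only the minority of rows with an open status are indexed.
CREATE NONCLUSTERED INDEX idx_Pod_status_open
ON dbo.pod (product)
INCLUDE (qty_ordered, qty_received)
WHERE [status] <> 'C';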
For the stock table's warehouse (and analysis_c) columns: if the given criterion limits the data a lot, adding an index on the field should help.
If pod.qty_ordered is usually less than or equal to pod.qty_received, it might be a good idea to add a persisted computed column, index it, and use it in the WHERE clause; see the sketch after this paragraph.
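A sketch under that assumption (qty_outstanding is a hypothetical column name):
-- Persisted so it is evaluated at write time and can be indexed.
ALTER TABLE dbo.pod
ADD qty_outstanding AS (qty_ordered - qty_received) PERSISTED;

CREATE NONCLUSTERED INDEX idx_Pod_qtyOutstanding
ON dbo.pod (qty_outstanding);
-- The WHERE clause can then use the sargable predicate: pod.qty_outstanding > 0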
Since these fields are in different tables, the query should start from the table that limits the data the most, so you might want to index only that table; indexes on the others might not help at all. I also assume you already have indexes on the fields you join the tables on; if not, that's the first thing to look at. As always, adding new indexes has a (small) impact on inserts and updates.
If the query does a lot of key lookups, it might help to add all the other referenced columns from that table as included columns in the index, but that also has an impact on updates and inserts.
I'm very new to SQL, and still learning. I'm using a reporting tool called Solarwinds Orion, and I'm honestly not sure how specific the query I have written is to the program, so if there's anything in the query that's confusing, let me know and I'll try to figure out if it's specific to the program or not.
The problem with the query I'm running is that it times out after a very long time (maybe an hour). The database I'm using is huge; unfortunately I don't know exactly how big, but I've been told it's huge.
Is there anything I am doing wrong that would have a huge performance impact?
SELECT TOP 10000
Nodes.Caption AS NodeName,
NetflowApplicationSummary.AppName AS Application_Name,
SUM(NetflowApplicationSummary.TotalBytes) AS SUM_of_Bytes_Transferred,
AVG(Case OutBandwidth
When 0 Then 0
Else (NetflowApplicationSummary.TotalBytes/OutBandwidth) * 100
End) AS TEST_PERCENT
FROM
((NetflowApplicationSummary
INNER JOIN Nodes ON (NetflowApplicationSummary.NodeID = Nodes.NodeID))
INNER JOIN InterfaceTraffic ON (Nodes.NodeID = InterfaceTraffic.InterfaceID))
INNER JOIN Interfaces ON (Nodes.NodeID = Interfaces.NodeID)
WHERE
( InterfaceTraffic.DateTime > (GetDate()-30) )
AND
(Nodes.WANCircuit = 1)
GROUP BY Nodes.Caption, NetflowApplicationSummary.AppName
EDIT: I ran COUNT(*) on each of my tables, with the results below.
SELECT COUNT(*) FROM NetflowApplicationSummary; -- 50,671,011
SELECT COUNT(*) FROM Nodes;                     -- 898
SELECT COUNT(*) FROM InterfaceTraffic;          -- 18,000,166
SELECT COUNT(*) FROM Interfaces;                -- 3,938
-- Total: 68,676,013
I honestly have no idea whether 68 million rows counts as a huge database.
A couple of notes:
The INNER JOIN operator is associative, so get rid of those parentheses in the FROM clause and let the optimizer figure out the best join order.
You may have an implied cursor from the getdate() function being called for every row. Store the value in a local variable and compare to that.
The resulting SQL should look like this:
DECLARE @Date AS datetime = GETDATE() - 30;
SELECT TOP 10000
Nodes.Caption AS NodeName,
NetflowApplicationSummary.AppName AS Application_Name,
SUM(NetflowApplicationSummary.TotalBytes) AS SUM_of_Bytes_Transferred,
AVG(Case OutBandwidth
When 0 Then 0
Else (NetflowApplicationSummary.TotalBytes/OutBandwidth) * 100
End) AS TEST_PERCENT
FROM NetflowApplicationSummary
INNER JOIN Nodes ON NetflowApplicationSummary.NodeID = Nodes.NodeID
INNER JOIN InterfaceTraffic ON Nodes.NodeID = InterfaceTraffic.InterfaceID
INNER JOIN Interfaces ON Nodes.NodeID = Interfaces.NodeID
WHERE InterfaceTraffic.DateTime > @Date
AND Nodes.WANCircuit = 1
GROUP BY Nodes.Caption, NetflowApplicationSummary.AppName
Also, make sure you have an index on the InterfaceTraffic table with DateTime as the leading field; a sketch follows. If it doesn't exist, you may need to pay the one-time penalty of creating it.
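A minimal sketch, assuming the table and column names above (the index name is an assumption):
CREATE NONCLUSTERED INDEX IX_InterfaceTraffic_DateTime
ON InterfaceTraffic ([DateTime])
INCLUDE (InterfaceID); -- covers the join column so the range seek avoids lookups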
If this doesn't help, then you may need to post the execution plan where it can be inspected.
Out of interest, also perform a COUNT(*) on all four tables and post the results, just so members here can make their own assessment of how big your database really is. It is amazing how many non-technical people still think a 1 or 10 GB database is huge, while I run that easily on my workstation!
I have a SQL query comprising two levels of sub-selects, and it is taking too much time.
The query goes like this:
select * from DALDBO.V_COUNTRY_DERIV_SUMMARY_XREF
where calculation_context_key = 130205268077
and DERIV_POSITION_KEY in
(select ctry_risk_derivs_psn_key
from DALDBO.V_COUNTRY_DERIV_PSN
where calculation_context_key = 130111216755
--and ctry_risk_derivs_psn_key = 76296412
and CREDIT_PRODUCT_TYPE = 'SWP OP'
and CALC_OBLIGOR_COUNTRY_OF_ASSETS in
(select ctry_cd
from DALDBO.V_PSN_COUNTRY
where calculation_context_key = 130134216755
--and ctry_risk_derivs_psn_key = 76296412
)
)
These tables are huge! Are there any optimizations available?
Without knowing anything about your table or view definitions, indexing, etc., I would start by looking at the sub-selects and ensuring that they perform optimally. I would also want to know how many values each sub-select returns, as this can impact performance.
How is calculation_context_key used to retrieve rows from V_COUNTRY_DERIV_PSN and V_PSN_COUNTRY? Is it an optimal execution plan?
How are DERIV_POSITION_KEY and CALC_OBLIGOR_COUNTRY_OF_ASSETS used in V_COUNTRY_DERIV_SUMMARY_XREF to retrieve rows? Again, look at the explain plan.
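For example, assuming an Oracle backend (the next answer mentions Oracle), you can inspect the plan of one sub-select like this:
-- Generate and display the plan for the inner query.
EXPLAIN PLAN FOR
select ctry_risk_derivs_psn_key
from DALDBO.V_COUNTRY_DERIV_PSN
where calculation_context_key = 130111216755
and CREDIT_PRODUCT_TYPE = 'SWP OP';

SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);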
First of all, can you write this query using inner joins (and not sub-selects)?
select a.*
from DALDBO.V_COUNTRY_DERIV_SUMMARY_XREF a
inner join DALDBO.V_COUNTRY_DERIV_PSN b
on a.DERIV_POSITION_KEY = b.ctry_risk_derivs_psn_key
inner join DALDBO.V_PSN_COUNTRY c
on b.CALC_OBLIGOR_COUNTRY_OF_ASSETS = c.ctry_cd
where a.calculation_context_key = 130205268077
and b.calculation_context_key = 130111216755
--and b.ctry_risk_derivs_psn_key = 76296412
and b.CREDIT_PRODUCT_TYPE = 'SWP OP'
and c.calculation_context_key = 130134216755
--and c.ctry_risk_derivs_psn_key = 76296412
Second, best practice says that when you don't select any data from the tables in a sub-select, you are better off using EXISTS instead of IN; newer versions of Oracle do this automatically and rewrite the whole thing as a join. A sketch of that EXISTS form follows.
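This sketch only rearranges the tables and filters already shown above:
select a.*
from DALDBO.V_COUNTRY_DERIV_SUMMARY_XREF a
where a.calculation_context_key = 130205268077
and exists (select 1
            from DALDBO.V_COUNTRY_DERIV_PSN b
            where b.ctry_risk_derivs_psn_key = a.DERIV_POSITION_KEY
            and b.calculation_context_key = 130111216755
            and b.CREDIT_PRODUCT_TYPE = 'SWP OP'
            and exists (select 1
                        from DALDBO.V_PSN_COUNTRY c
                        where c.ctry_cd = b.CALC_OBLIGOR_COUNTRY_OF_ASSETS
                        and c.calculation_context_key = 130134216755))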
Last, without any knowledge of your data or of what you are trying to do, I would suggest using views as little as you can; if you can query the underlying tables directly, that would be best, and you will probably see an immediate performance improvement.
I have a query joining 4 tables, with a lot of conditions in the WHERE clause and an ORDER BY on a numeric column. It takes 6 seconds to return, which is too long, and I need to speed it up. Surprisingly, I found that if I remove the ORDER BY clause it takes 2 seconds. Why does the ORDER BY make such a massive difference, and how can I optimize it? I am using SQL Server 2005. Many thanks.
I cannot confirm that the ORDER BY makes a big difference now that I am clearing the execution plan cache. However, can you shed some light on how to speed this up a little? The query is as follows (for simplicity it uses SELECT *, but I am only selecting the columns I need).
SELECT *
FROM View_Product_Joined j
INNER JOIN [dbo].[OPR_PriceLookup] pl on pl.siteID = NodeSiteID and pl.skuid = j.skuid
LEFT JOIN [dbo].[OPR_InventoryRules] irp on irp.ID = pl.SkuID and irp.InventoryRulesType = 'Product'
LEFT JOIN [dbo].[OPR_InventoryRules] irs on irs.ID = pl.siteID and irs.InventoryRulesType = 'Store'
WHERE SiteName = N'EcommerceSite'
AND Published = 1
AND DocumentCulture = N'en-GB'
AND NodeAliasPath LIKE N'/Products/Cats/Computers/Computer-servers/%'
AND NodeSKUID IS NOT NULL
AND SKUEnabled = 1
AND pl.PriceLookupID IN (SELECT TOP 1 PriceLookupID
                         FROM OPR_PriceLookup pl2
                         WHERE pl.skuid = pl2.skuid
                         AND (pl2.RoleID = -1 OR pl2.RoleID = 13)
                         ORDER BY pl2.RoleID DESC)
ORDER BY NodeOrder ASC
Why does the ORDER BY make such a massive difference, and how can I optimize it?
The ORDER BY needs to sort the result set, which may take a long time if it's big.
To optimize it, you may need to index the tables properly.
The index access path, however, has its own drawbacks, so it can even take longer.
If you have something other than equijoins in your query, or ranged predicates (like <, > or BETWEEN), or a GROUP BY clause, then the index used for the ORDER BY may prevent the other indexes from being used.
If you post the query, I'll probably be able to tell you how to optimize it.
Update:
Rewrite the query:
SELECT *
FROM View_Product_Joined j
LEFT JOIN
[dbo].[OPR_InventoryRules] irp
ON irp.ID = j.skuid
AND irp.InventoryRulesType = 'Product'
LEFT JOIN
[dbo].[OPR_InventoryRules] irs
ON irs.ID = j.NodeSiteID
AND irs.InventoryRulesType = 'Store'
CROSS APPLY
(
SELECT TOP 1 *
FROM OPR_PriceLookup pl
WHERE pl.siteID = j.NodeSiteID
AND pl.skuid = j.skuid
AND pl.RoleID IN (-1, 13)
ORDER BY
pl.RoleID desc
) pl
WHERE SiteName = N'EcommerceSite'
AND Published = 1
AND DocumentCulture = N'en-GB'
AND NodeAliasPath LIKE N'/Products/Cats/Computers/Computer-servers/%'
AND NodeSKUID IS NOT NULL
AND SKUEnabled = 1
ORDER BY
NodeOrder ASC
The relation View_Product_Joined, as the name suggests, is probably a view.
Could you please post its definition?
If it is indexable, you may benefit from creating an index on View_Product_Joined (SiteName, Published, DocumentCulture, SKUEnabled, NodeOrder); a sketch follows.
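A sketch of what that could look like, assuming the view qualifies as an indexed view (SQL Server requires the view to be schema-bound and its first index to be unique and clustered; the appended NodeSKUID key column and the index name are assumptions to make the key plausibly unique):
-- Requires View_Product_Joined to be created WITH SCHEMABINDING.
CREATE UNIQUE CLUSTERED INDEX IX_View_Product_Joined_Order
ON View_Product_Joined (SiteName, Published, DocumentCulture, SKUEnabled, NodeOrder, NodeSKUID);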