The below query that I'm executing through SQL Server Management Studio is painfully slow.
The input table tbl_sb12_bhs has about 40000 records and after an hour only 40 records are processed.
What can be changed here to make this run a bit faster?
DECLARE #bsrange INT
SET #bsrange = 0
WHILE #bsrange <= (SELECT max([p_a_l_out])
FROM [DB001].[FD\f7].[tbl_sb12_bhs])
BEGIN
INSERT INTO [FD\f7].tbl_sb13_b_lin1
(aId,
p_a_l_out,
bs_id,
bs_db,
bs_tbl,
bs_column,
Int1,
cd1,
Hop1,
Int2,
cd2,
Hop2,
Int3,
cd3,
Hop3,
Int4,
cd4,
Hop4,
Int5,
cd5,
Hop5,
Int6,
cd6,
Hop6,
Int7,
cd7,
Hop7,
Int8,
cd8,
Hop8,
Int9,
cd9,
Hop9,
Int10,
cd10,
Hop10,
Int11,
cd11,
Hop11,
Int12,
cd12,
Hop12,
Int13,
cd13,
Hop13,
Int14,
cd14,
Hop14,
Int15,
cd15,
Hop15,
Int16,
cd16,
Hop16)
SELECT DISTINCT tbl_sb12_bhs.aId,
tbl_sb12_bhs.p_a_l_out,
tbl_sb12_bhs.bs_id,
tbl_sb12_bhs.bs_db,
tbl_sb12_bhs.bs_tbl,
tbl_sb12_bhs.bs_column,
tbl_rpt_val_pt_crl.pt_el_Int AS Int1,
tbl_rpt_val_pt_crl.user_cd AS cd1,
tbl_rpt_val_pt_crl.cfk_upel AS Hop1,
tbl_rpt_val_pt_crl_1.pt_el_Int AS Int2,
tbl_rpt_val_pt_crl_1.user_cd AS cd2,
tbl_rpt_val_pt_crl_1.cfk_upel AS Hop2,
tbl_rpt_val_pt_crl_2.pt_el_Int AS Int3,
tbl_rpt_val_pt_crl_2.user_cd AS cd3,
tbl_rpt_val_pt_crl_2.cfk_upel AS Hop3,
tbl_rpt_val_pt_crl_3.pt_el_Int AS Int4,
tbl_rpt_val_pt_crl_3.user_cd AS cd4,
tbl_rpt_val_pt_crl_3.cfk_upel AS Hop4,
tbl_rpt_val_pt_crl_4.pt_el_Int AS Int5,
tbl_rpt_val_pt_crl_4.user_cd AS cd5,
tbl_rpt_val_pt_crl_4.cfk_upel AS Hop5,
tbl_rpt_val_pt_crl_5.pt_el_Int AS Int6,
tbl_rpt_val_pt_crl_5.user_cd AS cd6,
tbl_rpt_val_pt_crl_5.cfk_upel AS Hop6,
tbl_rpt_val_pt_crl_6.pt_el_Int AS Int7,
tbl_rpt_val_pt_crl_6.user_cd AS cd7,
tbl_rpt_val_pt_crl_6.cfk_upel AS Hop7,
tbl_rpt_val_pt_crl_7.pt_el_Int AS Int8,
tbl_rpt_val_pt_crl_7.user_cd AS cd8,
tbl_rpt_val_pt_crl_7.cfk_upel AS Hop8,
tbl_rpt_val_pt_crl_8.pt_el_Int AS Int9,
tbl_rpt_val_pt_crl_8.user_cd AS cd9,
tbl_rpt_val_pt_crl_8.cfk_upel AS Hop9,
tbl_rpt_val_pt_crl_9.pt_el_Int AS Int10,
tbl_rpt_val_pt_crl_9.user_cd AS cd10,
tbl_rpt_val_pt_crl_9.cfk_upel AS Hop10,
tbl_rpt_val_pt_crl_10.pt_el_Int AS Int11,
tbl_rpt_val_pt_crl_10.user_cd AS cd11,
tbl_rpt_val_pt_crl_10.cfk_upel AS Hop11,
tbl_rpt_val_pt_crl_11.pt_el_Int AS Int12,
tbl_rpt_val_pt_crl_11.user_cd AS cd12,
tbl_rpt_val_pt_crl_11.cfk_upel AS Hop12,
tbl_rpt_val_pt_crl_12.pt_el_Int AS Int13,
tbl_rpt_val_pt_crl_12.user_cd AS cd13,
tbl_rpt_val_pt_crl_12.cfk_upel AS Hop13,
tbl_rpt_val_pt_crl_13.pt_el_Int AS Int14,
tbl_rpt_val_pt_crl_13.user_cd AS cd14,
tbl_rpt_val_pt_crl_13.cfk_upel AS Hop14,
tbl_rpt_val_pt_crl_14.pt_el_Int AS Int15,
tbl_rpt_val_pt_crl_14.user_cd AS cd15,
tbl_rpt_val_pt_crl_14.cfk_upel AS Hop15,
tbl_rpt_val_pt_crl_15.pt_el_Int AS Int16,
tbl_rpt_val_pt_crl_15.user_cd AS cd16,
tbl_rpt_val_pt_crl_15.cfk_upel AS Hop16
FROM (SELECT DISTINCT pk_a AS aId,
p_a_l_out,
bs_id,
bs_db,
bs_tbl,
bs_column,
hop_pt_id_1,
hop_pt_id_2,
hop_pt_id_3,
hop_pt_id_4,
hop_pt_id_5,
hop_pt_id_6,
hop_pt_id_7,
hop_pt_id_8,
hop_pt_id_9,
hop_pt_id_10,
hop_pt_id_11,
hop_pt_id_12,
hop_pt_id_13,
hop_pt_id_14,
hop_pt_id_15,
hop_pt_id_16
FROM [FD\f7].tbl_sb12_bhs
WHERE [p_a_l_out] >= #bsrange
AND [p_a_l_out] < ( #bsrange + 1 )) AS tbl_sb12_bhs
LEFT JOIN tbl_rpt_val_pt_crl
ON tbl_sb12_bhs.hop_pt_id_1 = tbl_rpt_val_pt_crl.sk_el_pt
LEFT JOIN tbl_rpt_val_pt_crl AS tbl_rpt_val_pt_crl_1
ON tbl_sb12_bhs.hop_pt_id_2 = tbl_rpt_val_pt_crl_1.sk_el_pt
LEFT JOIN tbl_rpt_val_pt_crl AS tbl_rpt_val_pt_crl_2
ON tbl_sb12_bhs.hop_pt_id_3 = tbl_rpt_val_pt_crl_2.sk_el_pt
LEFT JOIN tbl_rpt_val_pt_crl AS tbl_rpt_val_pt_crl_3
ON tbl_sb12_bhs.hop_pt_id_4 = tbl_rpt_val_pt_crl_3.sk_el_pt
LEFT JOIN tbl_rpt_val_pt_crl AS tbl_rpt_val_pt_crl_4
ON tbl_sb12_bhs.hop_pt_id_5 = tbl_rpt_val_pt_crl_4.sk_el_pt
LEFT JOIN tbl_rpt_val_pt_crl AS tbl_rpt_val_pt_crl_5
ON tbl_sb12_bhs.hop_pt_id_6 = tbl_rpt_val_pt_crl_5.sk_el_pt
LEFT JOIN tbl_rpt_val_pt_crl AS tbl_rpt_val_pt_crl_6
ON tbl_sb12_bhs.hop_pt_id_7 = tbl_rpt_val_pt_crl_6.sk_el_pt
LEFT JOIN tbl_rpt_val_pt_crl AS tbl_rpt_val_pt_crl_7
ON tbl_sb12_bhs.hop_pt_id_8 = tbl_rpt_val_pt_crl_7.sk_el_pt
LEFT JOIN tbl_rpt_val_pt_crl AS tbl_rpt_val_pt_crl_8
ON tbl_sb12_bhs.hop_pt_id_9 = tbl_rpt_val_pt_crl_8.sk_el_pt
LEFT JOIN tbl_rpt_val_pt_crl AS tbl_rpt_val_pt_crl_9
ON tbl_sb12_bhs.hop_pt_id_10 = tbl_rpt_val_pt_crl_9.sk_el_pt
LEFT JOIN tbl_rpt_val_pt_crl AS tbl_rpt_val_pt_crl_10
ON tbl_sb12_bhs.hop_pt_id_11 = tbl_rpt_val_pt_crl_10.sk_el_pt
LEFT JOIN tbl_rpt_val_pt_crl AS tbl_rpt_val_pt_crl_11
ON tbl_sb12_bhs.hop_pt_id_12 = tbl_rpt_val_pt_crl_11.sk_el_pt
LEFT JOIN tbl_rpt_val_pt_crl AS tbl_rpt_val_pt_crl_12
ON tbl_sb12_bhs.hop_pt_id_13 = tbl_rpt_val_pt_crl_12.sk_el_pt
LEFT JOIN tbl_rpt_val_pt_crl AS tbl_rpt_val_pt_crl_13
ON tbl_sb12_bhs.hop_pt_id_14 = tbl_rpt_val_pt_crl_13.sk_el_pt
LEFT JOIN tbl_rpt_val_pt_crl AS tbl_rpt_val_pt_crl_14
ON tbl_sb12_bhs.hop_pt_id_15 = tbl_rpt_val_pt_crl_14.sk_el_pt
LEFT JOIN tbl_rpt_val_pt_crl AS tbl_rpt_val_pt_crl_15
ON tbl_sb12_bhs.hop_pt_id_16 = tbl_rpt_val_pt_crl_15.sk_el_pt
SET #bsrange = #bsrange + 1
END
My best guess is that it's slow because you're doing a number of intensive operations all in one go. Without any sample data it's tough, but I can try to make a few suggestions.
From what you said about it only processing 40 records after an hour, it's what's going on inside the loop that's slowing you down.
SELECT DISTINCT isn't cheap because it has to compare all the data, and you're comparing quite a lot of columns as well. If you can, it might run quicker if you limit the number of columns to the bare minimum required for a distinct selection then self joining that to the original table. It should be simple enough to test in isolation to the rest of it to make sure you're getting the same results and whether or not it's quicker.
Also the more joins you have, the worse the performance is in general... the price we pay for normalisation.
Anyway, I would take a step back from it and try to break this down into its smallest units of work and then you can test each one individually until you find the culprit. In doing so, you might think of a much better way to do this. Again, without any sample data this is a difficult one for me to help with.
Well if you have an index or indexes on the target then SQL will reindex every row. I'd disable any indexes on the target table and then renable them when the insert is complete. Id batch the inserts inro ranges of (say) 5k records so any blocking will be reduced, or I'd create a temp file as a result of the select and bcp in the results. Because your doing that horrendous set of left joins each time prior to one record insert. SQL just cant optimise more than about 7 or 8 left or right joins. My guess is that there are little or no indexes on the table being inserted from which means a table scan on for each join or around 17 tables scans for each one row inserted. Sorry but this approach is wrong at every stage. Or you could get you boss to buy you a datecentre....
Another thing you might want to do is do the join initially into a temp table and then reference that. You would not have to do the distinct or joins every time. Just add the where clause for the bsrange.
So it would be something like:
Create temporary table with as much of the joins/distinct as you can.
while.....
insert into [FD\f7].tbl_sb13_b_lin1
select * from temptable where [p_a_l_out] >= #bsrange
AND [p_a_l_out] < ( #bsrange + 1 )
Related
This code is taking a significant amount of time to run. It's returning every single transaction within the date range but I just need to know if the customer has had at least one transaction, then include the CustomerID, CustomerName, Type, Sign, ReportingName.
I think I need to GROUP BY 'CustomerID' but again only if there was a transaction within the date range. And of course, I'm sure there is an optimal way to execute the below TSQL because it's quite slow at present.
Thanks in advance for any help!
SELECT [ABC].[dbo].[vwPrimary].[RelatedNameId] AS CustomerID
,[ABC].[dbo].[vwPrimary].[RelatedName] AS CustomerName
,[AFGPurchase].[IvL].[TaxTreatment].[ParticluarType] AS Type
,[AFGPurchase].[IvL].[Product].[Sign] AS [Sign]
,[AFGPurchase].[IvL].[Product].[ReportingName] AS ReportingName
,[AFGPurchase].[IvL].[Transaction].[EffectiveDate] AS 'Date'
FROM (((([AFGPurchase].[IvL].[Account]
INNER JOIN [AFGPurchase].[IvL].[Position] ON [AFGPurchase].[IvL].[Account].[AccountId] = [AFGPurchase].[IvL].[Position].[AccountId])
INNER JOIN [AFGPurchase].[IvL].[Product] ON [AFGPurchase].[IvL].[Position].[ProductID] = [AFGPurchase].[IvL].[Product].[ProductId])
INNER JOIN [ABC].[dbo].[vwPrimary] ON [AFGPurchase].[IvL].[Account].[ReportingEntityId] = [ABC].[dbo].[vwPrimary].[RelatedNameId])
INNER JOIN [AFGPurchase].[IvL].[TaxTreatment] ON [AFGPurchase].[IvL].[Account].[TaxTreatmentId] = [AFGPurchase].[IvL].[TaxTreatment].[TaxTreatmentId])
INNER JOIN [AFGPurchase].[IvL].[Transaction] ON [AFGPurchase].[IvL].[Position].[PositionId] = [AFGPurchase].[IvL].[Transaction].[PositionId]
WHERE ((([AFGPurchase].[IvL].[TaxTreatment].[RegistrationType]) LIKE 'NON%')
AND (([AFGPurchase].[IvL].[Product].[Sign])='XYZ2')
AND (([AFGPurchase].[IvL].[Position].[Quantity])<>0)
AND (([AFGPurchase].[IvL].[Transaction].[EffectiveDate]) between '2021-12-31' and '2022-12-31'))
Check your indexes on fragmentation, to speed up your query. And make sure you have indexes.
If you just need one result, just TOP 1
SELECT TOP 1 [ABC].[dbo].[vwPrimary].[RelatedNameId] AS CustomerID
,[ABC].[dbo].[vwPrimary].[RelatedName] AS CustomerName
,[AFGPurchase].[IvL].[TaxTreatment].[ParticluarType] AS Type
,[AFGPurchase].[IvL].[Product].[Sign] AS [Sign]
,[AFGPurchase].[IvL].[Product].[ReportingName] AS ReportingName
,[AFGPurchase].[IvL].[Transaction].[EffectiveDate] AS 'Date'
FROM (((([AFGPurchase].[IvL].[Account]
INNER JOIN [AFGPurchase].[IvL].[Position] ON [AFGPurchase].[IvL].[Account].[AccountId] = [AFGPurchase].[IvL].[Position].[AccountId])
INNER JOIN [AFGPurchase].[IvL].[Product] ON [AFGPurchase].[IvL].[Position].[ProductID] = [AFGPurchase].[IvL].[Product].[ProductId])
INNER JOIN [ABC].[dbo].[vwPrimary] ON [AFGPurchase].[IvL].[Account].[ReportingEntityId] = [ABC].[dbo].[vwPrimary].[RelatedNameId])
INNER JOIN [AFGPurchase].[IvL].[TaxTreatment] ON [AFGPurchase].[IvL].[Account].[TaxTreatmentId] = [AFGPurchase].[IvL].[TaxTreatment].[TaxTreatmentId])
INNER JOIN [AFGPurchase].[IvL].[Transaction] ON [AFGPurchase].[IvL].[Position].[PositionId] = [AFGPurchase].[IvL].[Transaction].[PositionId]
WHERE ((([AFGPurchase].[IvL].[TaxTreatment].[RegistrationType]) LIKE 'NON%')
AND (([AFGPurchase].[IvL].[Product].[Sign])='XYZ2')
AND (([AFGPurchase].[IvL].[Position].[Quantity])<>0)
AND (([AFGPurchase].[IvL].[Transaction].[EffectiveDate]) between '2021-12-31' and '2022-12-31'))
If you only need to check for the existence of a row, and not actually get any data from it then use EXISTS() rather than INNER JOIN, e.g.
SELECT vpr.[RelatedNameId] AS CustomerID
,vpr.[RelatedName] AS CustomerName
,tt.[ParticluarType] AS Type
,prd.[Sign]
,prd.ReportingName
,tr.[EffectiveDate] AS [Date]
FROM [AFGPurchase].[IvL].[Account] AS acc
INNER JOIN [AFGPurchase].[IvL].[Position] AS pos ON acc.[AccountId] = pos.[AccountId]
INNER JOIN [AFGPurchase].[IvL].[Product] AS prd ON pos.[ProductID] = prd.[ProductId]
INNER JOIN [ABC].[dbo].[vwPrimary] AS vpr ON acc.[ReportingEntityId] = vpr.[RelatedNameId]
INNER JOIN [AFGPurchase].[IvL].[TaxTreatment] AS tt ON acc.[TaxTreatmentId] = tt.[TaxTreatmentId]
WHERE tt.[RegistrationType] LIKE 'NON%'
AND prd.[Sign]='XYZ2'
AND pos.[Quantity]<>0
AND EXISTS
( SELECT 1
FROM [AFGPurchase].[IvL].[Transaction] AS tr
WHERE tr.[PositionId] = pos.[PositionId]
AND tr.[EffectiveDate] BETWEEN '2021-12-31' AND '2022-12-31'
);
N.B. I have added in table aliases and removed all the unnecessary parentheses for readability - you may disagree that it is more readable, but I would expect that most people would agree
This may not offer any performance benefits over simply grouping by the columns you are selecting and keeping your joins as they are - SQL is after all a declarative language where you tell the engine what you want, not how to get it. So you may find that the two plans are the same because you are requesting the same result. Using EXISTS does have the advance of being more semantically tied to what you are trying to do though, so gives the optimiser the best chance of getting to the right plan. If you are still having performance issues, then you may need to inspect the execution plan, and see if it suggests any indexes.
Finally, if you are really still using SQL Server 2008 then you really need to start thinking about your upgrade path. It has been completely unsupported for over 3 years now.
I have the following SQL Query running in a msaccess database query, that takes several seconds,4 or 5 seconds in a good machines. but in the client machine takes many more...I have to optimize this query.
I try making index in some tables like table Lins and MKT .. And seems to get a little better,
The problem is that I have 8 querys like this running one after the other.. and its gettin really slow with more than 400.000 records
SELECT LINs.LIN,
Sum(VentasDet.Cantidad) AS SumaDeCantidad
FROM ((
(SELECT *
FROM ventas
WHERE (CBTE='COT'
OR CBTE='FCA'
OR CBTE='FCB'
OR CBTE='PR'
OR CBTE='RTO'
OR CBTE='TK')
AND (Suc=0
OR Suc=1
OR Suc=2
OR Suc=4
OR Suc=5)
AND (Caja=0
OR Caja=1
OR Caja=2)
AND (PRDO='12-2017'
OR PRDO='11-2017'
OR PRDO='10-2017'
OR PRDO='09-2017')
AND (MKT=1
OR MKT=2
OR MKT=3
OR MKT=4
OR MKT=5) ) AS Ventas
INNER JOIN
(SELECT *
FROM VentasDet
WHERE LIN<>'0-'
AND LIN<>'1AS'
AND LIN<>'1VF'
AND LIN<>'NEW'
AND LIN<>'OSE'
AND LIN<>'OLJ'
AND LIN<>'OS-O' ) AS VentasDet ON (Ventas.Numero = VentasDet.Numero)
AND (Ventas.Suc = VentasDet.Suc)
AND (Ventas.CBTE = VentasDet.CBTE))
INNER JOIN
(SELECT Codigo,
MaMi
FROM Clis
WHERE MaMi=0
OR MaMi=1
OR MaMi=2 ) AS Clis ON Ventas.Cli = Clis.Codigo)
INNER JOIN LINs ON VentasDet.LIN = LINs.Codigo
GROUP BY VentasDet.LIN,
LINs.LIN
ORDER BY Sum(VentasDet.Cantidad) DESC
Can you try below to see if it make any difference?
1. make a middle table to join CBTE;
2. make a middle table VentasDet;
create table CBTE (CBTE varchar(3));
insert into CBTE values('COT', 'FCA', 'FCB', 'PR','RTO', 'TK');
create table LIN (LIN varchar(4));
insert into LIN values('0-','1AS','1VF','NEW','OSE','OLJ','OS-O');
( SELECT *
FROM VentasDet V
left join LIN L on v.Lin=L.Lin where L.Lin is null) AS VentasDet
I have a query against a table that contains like 2 million rows using linked server.
Select * from OPENQUERY(LinkedServerName,
'SELECT
PV.col1
,PV.col2
,PV.col3
,VTR.col1
,CTR.col1
,PSR.col1
FROM
LinkedDbName.dbo.tbl1 PV
INNER JOIN LinkedDbName.dbo.tbl2 VTR
ON PV.col_id = VTR.col_id
INNER JOIN LinkedDbName.dbo.tbl3 CTR
ON PV.col_id = CTR.col_id
INNER JOIN LinkedDbName.dbo.tbl4 PSR
ON PV.col_id = PSR.col_id
WHERE
PV.col_id = ''80C53C9B-6272-11DA-BB34-000E0C7F3ED2''')
That query results into 365 rows and is executed within 0 second.
However when I make that query into a view it runs for about minimum of 20 seconds and sometimes it reaches to 40 seconds tops.
Here's my create view script
CREATE VIEW [dbo].[myview]
AS
Select * from OPENQUERY(LinkedServerName,
'SELECT
PV.col1
,PV.col2
,PV.col3
,VTR.col1
,CTR.col1
,PSR.col1
FROM
LinkedDbName.dbo.tbl1 PV
INNER JOIN LinkedDbName.dbo.tbl2 VTR
ON PV.col_id = VTR.col_id
INNER JOIN LinkedDbName.dbo.tbl3 CTR
ON PV.col_id = CTR.col_id
INNER JOIN LinkedDbName.dbo.tbl4 PSR
ON PV.col_id = PSR.col_id')
then
Select * from myview where PV.col_id = '80C53C9B-6272-11DA-BB34-000E0C7F3ED2'
Any idea ? Thanks !
Your queries are quite different. In the first, the where clause is part of the SQL statement passed to OPENQUERY(). This has two important effects:
The amount of data returned is much smaller, only being the rows that match the condition.
The query can be optimized with the WHERE clause.
If you need to share the table, I might suggest that you make a copy on the local server -- either using replication or scheduling a job to copy it over.
I have a SQL query with many left joins
SELECT COUNT(DISTINCT po.o_id)
FROM T_PROPOSAL_INFO po
LEFT JOIN T_PLAN_TYPE tp ON tp.plan_type_id = po.Plan_Type_Fk
LEFT JOIN T_PRODUCT_TYPE pt ON pt.PRODUCT_TYPE_ID = po.cust_product_type_fk
LEFT JOIN T_PROPOSAL_TYPE prt ON prt.PROPTYPE_ID = po.proposal_type_fk
LEFT JOIN T_BUSINESS_SOURCE bs ON bs.BUSINESS_SOURCE_ID = po.CONT_AGT_BRK_CHANNEL_FK
LEFT JOIN T_USER ur ON ur.Id = po.user_id_fk
LEFT JOIN T_ROLES ro ON ur.roleid_fk = ro.Role_Id
LEFT JOIN T_UNDERWRITING_DECISION und ON und.O_Id = po.decision_id_fk
LEFT JOIN T_STATUS st ON st.STATUS_ID = po.piv_uw_status_fk
LEFT OUTER JOIN T_MEMBER_INFO mi ON mi.proposal_info_fk = po.O_ID
WHERE 1 = 1
AND po.CUST_APP_NO LIKE '%100010233976%'
AND 1 = 1
AND po.IS_STP <> 1
AND po.PIV_UW_STATUS_FK != 10
The performance seems to be not good and I would like to optimize the query.
Any suggestions please?
Try this one -
SELECT COUNT(DISTINCT po.o_id)
FROM T_PROPOSAL_INFO po
WHERE PO.CUST_APP_NO LIKE '%100010233976%'
AND PO.IS_STP <> 1
AND po.PIV_UW_STATUS_FK != 10
First, check your indexes. Are they old? Did they get fragmented? Do they need rebuilding?
Then, check your "execution plan" (varies depending on the SQL Engine): are all joins properly understood? Are some of them 'out of order'? Do some of them transfer too many data?
Then, check your plan and indexes: are all important columns covered? Are there any outstandingly lengthy table scans or joins? Are the columns in indexes IN ORDER with the query?
Then, revise your query:
- can you extract some parts that normally would quickly generate small rowset?
- can you add new columns to indexes so join/filter expressions will get covered?
- or reorder them so they match the query better?
And, supporting the solution from #Devart:
Can you eliminate some tables on the way? does the where touch the other tables at all? does the data in the other tables modify the count significantly? If neither SELECT nor WHERE never touches the other joined columns, and if the COUNT exact value is not that important (i.e. does that T_PROPOSAL_INFO exist?) then you might remove all the joins completely, as Devart suggested. LEFTJOINs never reduce the number of rows. They only copy/expand/multiply the rows.
I'm definitely not a DBA and unfortunately we don't have a DBA to consult within at our company. I was wondering if someone could give me a recommendation on how to improve this query, either by changing the query itself or adding indexes to the database.
Looking at the execution plan of the query it seems like the outer joins are killing the query. This query only returns 350k results, but it takes almost 30 seconds to complete. I don't know much about DB's, but I don't think this is good? Perhaps I'm wrong?
Any suggestions would be greatly appreciated. Thanks in advance.
As a side note this is obviously being create by an ORM and not me directly. We are using Linq-to-SQL.
SELECT
[t12].[value] AS [DiscoveryEnabled],
[t12].[value2] AS [isConnected],
[t12].[Interface],
[t12].[Description] AS [InterfaceDescription],
[t12].[value3] AS [Duplex],
[t12].[value4] AS [IsEnabled],
[t12].[value5] AS [Host],
[t12].[value6] AS [HostIP],
[t12].[value7] AS [MAC],
[t12].[value8] AS [MACadded],
[t12].[value9] AS [PortFast],
[t12].[value10] AS [PortSecurity],
[t12].[value11] AS [ShortHost],
[t12].[value12] AS [SNMPlink],
[t12].[value13] AS [Speed],
[t12].[value14] AS [InterfaceStatus],
[t12].[InterfaceType],
[t12].[value15] AS [IsUserPort],
[t12].[value16] AS [VLAN],
[t12].[value17] AS [Code],
[t12].[Description2] AS [Description],
[t12].[Host] AS [DeviceName],
[t12].[NET_OUID],
[t12].[DisplayName] AS [Net_OU],
[t12].[Enclave]
FROM (
SELECT
[t1].[DiscoveryEnabled] AS [value],
[t1].[IsConnected] AS [value2],
[t0].[Interface],
[t0].[Description],
[t2].[Duplex] AS [value3],
[t0].[IsEnabled] AS [value4],
[t3].[Host] AS [value5],
[t6].[Address] AS [value6],
[t3].[MAC] AS [value7],
[t3].[MACadded] AS [value8],
[t2].[PortFast] AS [value9],
[t2].[PortSecurity] AS [value10],
[t4].[Host] AS [value11],
[t0].[SNMPlink] AS [value12],
[t2].[Speed] AS [value13],
[t2].[InterfaceStatus] AS [value14],
[t8].[InterfaceType],
[t0].[IsUserPort] AS [value15],
[t2].[VLAN] AS [value16],
[t9].[Code] AS [value17],
[t9].[Description] AS [Description2],
[t7].[Host], [t7].[NET_OUID],
[t10].[DisplayName],
[t11].[Enclave],
[t7].[Decommissioned]
FROM [dbo].[IDB_Interface] AS [t0]
LEFT OUTER JOIN [dbo].[IDB_InterfaceLayer2] AS [t1] ON [t0].[IDB_Interface_ID] = [t1].[IDB_Interface_ID]
LEFT OUTER JOIN [dbo].[IDB_LANinterface] AS [t2] ON [t1].[IDB_InterfaceLayer2_ID] = [t2].[IDB_InterfaceLayer2_ID]
LEFT OUTER JOIN [dbo].[IDB_Host] AS [t3] ON [t2].[IDB_LANinterface_ID] = [t3].[IDB_LANinterface_ID]
LEFT OUTER JOIN [dbo].[IDB_Infrastructure] AS [t4] ON [t0].[IDB_Interface_ID] = [t4].[IDB_Interface_ID]
LEFT OUTER JOIN [dbo].[IDB_AddressMapIPv4] AS [t5] ON [t3].[IDB_AddressMapIPv4_ID] = ([t5].[IDB_AddressMapIPv4_ID])
LEFT OUTER JOIN [dbo].[IDB_AddressIPv4] AS [t6] ON [t5].[IDB_AddressIPv4_ID] = [t6].[IDB_AddressIPv4_ID]
INNER JOIN [dbo].[ART_Asset] AS [t7] ON [t7].[ART_Asset_ID] = [t0].[ART_Asset_ID]
LEFT OUTER JOIN [dbo].[NSD_InterfaceType] AS [t8] ON [t8].[NSD_InterfaceTypeID] = [t0].[NSD_InterfaceTypeID]
INNER JOIN [dbo].[NSD_InterfaceCode] AS [t9] ON [t9].[NSD_InterfaceCodeID] = [t0].[NSD_InterfaceCodeID]
INNER JOIN [dbo].[NET_OU] AS [t10] ON [t10].[NET_OUID] = [t7].[NET_OUID]
INNER JOIN [dbo].[NET_Enclave] AS [t11] ON [t11].[NET_EnclaveID] = [t10].[NET_EnclaveID]
) AS [t12]
WHERE ([t12].[Enclave] = 'USMC') AND (NOT ([t12].[Decommissioned] = 1))
LINQ-TO-SQL Query:
return from t in db.IDB_Interfaces
join v in db.IDB_InterfaceLayer3s on t.IDB_Interface_ID equals v.IDB_Interface_ID
join u in db.ART_Assets on t.ART_Asset_ID equals u.ART_Asset_ID
join c in db.NET_OUs on u.NET_OUID equals c.NET_OUID
join w in
(from d in db.IDB_InterfaceIPv4s
select new { d.IDB_InterfaceIPv4_ID, d.IDB_InterfaceLayer3_ID, d.IDB_AddressMapIPv4_ID, d.IDB_AddressMapIPv4.IDB_AddressIPv4.Address })
on v.IDB_InterfaceLayer3_ID equals w.IDB_InterfaceLayer3_ID
join h in db.NET_Enclaves on c.NET_EnclaveID equals h.NET_EnclaveID into enclaveLeftJoin
from i in enclaveLeftJoin.DefaultIfEmpty()
join m in
(from z in db.IDB_StandbyIPv4s
select new
{
z.IDB_InterfaceIPv4_ID,
z.IDB_AddressMapIPv4_ID,
z.IDB_AddressMapIPv4.IDB_AddressIPv4.Address,
z.Preempt,
z.Priority
})
on w.IDB_InterfaceIPv4_ID equals m.IDB_InterfaceIPv4_ID into standbyLeftJoin
from k in standbyLeftJoin.DefaultIfEmpty()
where t.ART_Asset.Decommissioned == false
select new NetIDBGridDataResults
{
DeviceName = u.Host,
Host = u.Host,
Interface = t.Interface,
IPAddress = w.Address,
ACLIn = v.InboundACL,
ACLOut = v.OutboundACL,
VirtualAddress = k.Address,
VirtualPriority = k.Priority,
VirtualPreempt = k.Preempt,
InterfaceDescription = t.Description,
Enclave = i.Enclave
};
As a rule (and this is very general), you want an index on:
JOIN fields (both sides)
Common WHERE filter fields
Possibly fields you aggregate
For this query, start with checking your JOIN criteria. Any one of those missing will force a table scan which is a big hit.
Looking at the execution plan of the query it seems like the outer joins are killing the query.
This query only returns 350k results, but it takes almost 30 seconds to complete. I don't know
much about DB's, but I don't think this is good? Perhaps I'm wrong?
A man has got to do waht a mana has got to do.
The joins may kill you, but when you need them YOU NEED THEM. Some tasks take long.
Make sure you ahve all indices you need.
Make sure your sql server is not a sad joke hardware wise.
All you can do.
I woudl bet someone has no clue about SQL and needs to be enlighted to the power of indices.