Which is faster: WHERE or HAVING? - sql

I'm trying to build a query in Microsoft Access and something weird is happening with the WHERE and HAVING clauses. While experimenting with using WHERE and HAVING on a date field (tblDailyFactor.Date) I discovered that including a filter in both the WHERE and HAVING clauses makes the query run significantly faster than simply including the filter in the WHERE clause alone. I used the VBA Timer function to measure the processing time and listed those below along with the two different sets of code.
Can anyone help me to understand why this is happening? I'd like to implement this into my query but I need to be able to explain and justify the code. Thanks!
Run time: 1.01 seconds
SELECT tblDailyFactor.Date, tblStaticData1.Name
FROM ((((tblSubTransactions INNER JOIN tblTransactions ON tblSubTransactions.ReferenceNumber = tblTransactions.ReferenceNumber)
INNER JOIN tblAccounts ON tblTransactions.LocalID = tblAccounts.LocalID)
INNER JOIN tblStaticData2 ON tblAccounts.LocalID = tblStaticData2.LocalID)
INNER JOIN tblStaticData1 ON tblStaticData2.GlobalID = tblStaticData1.GlobalID)
INNER JOIN tblDailyFactor ON tblStaticData1.GlobalID = tblDailyFactor.GlobalID
WHERE (((tblTransactions.Date)<=#8/31/2022#) AND ((tblAccounts.Status)='Active') AND ((tblDailyFactor.Date)=#8/31/2022#))
GROUP BY tblDailyFactor.Date, tblStaticData1.Name
HAVING ((Sum(tblSubTransactions.BalanceUSD))>0.009);
Run time: 0.16 seconds
SELECT tblDailyFactor.Date, tblStaticData1.Name
FROM ((((tblSubTransactions INNER JOIN tblTransactions ON tblSubTransactions.ReferenceNumber = tblTransactions.ReferenceNumber)
INNER JOIN tblAccounts ON tblTransactions.LocalID = tblAccounts.LocalID)
INNER JOIN tblStaticData2 ON tblAccounts.LocalID = tblStaticData2.LocalID)
INNER JOIN tblStaticData1 ON tblStaticData2.GlobalID = tblStaticData1.GlobalID)
INNER JOIN tblDailyFactor ON tblStaticData1.GlobalID = tblDailyFactor.GlobalID
WHERE (((tblTransactions.Date)<=#8/31/2022#) AND ((tblAccounts.Status)='Active') AND ((tblDailyFactor.Date)=#8/31/2022#))
GROUP BY tblDailyFactor.Date, tblStaticData1.Name
HAVING (((Sum(tblSubTransactions.BalanceUSD))>0.009) AND ((tblDailyFactor.Date)=#8/31/2022#));
(Running Microsoft 365 MSO, Version 2110 Build 16.0.14527.20234 64-bit)
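As a sanity check on the two queries above, here is a minimal sketch showing that repeating the date filter in HAVING does not change the result when the same filter is already in WHERE and the column is part of the GROUP BY. It uses SQLite through Python rather than Access, and the table `factor` and its columns are hypothetical stand-ins, so it says nothing about why Access runs one form faster; it only illustrates that the two forms are logically equivalent.

```python
import sqlite3

# Hypothetical single-table stand-in for the joined tables in the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE factor (d TEXT, name TEXT, balance REAL);
    INSERT INTO factor VALUES
        ('2022-08-31', 'A', 5.0),
        ('2022-08-31', 'A', 2.5),
        ('2022-08-31', 'B', 0.001),
        ('2022-08-30', 'C', 9.0);
""")

# Date filter in WHERE only.
where_only = conn.execute("""
    SELECT d, name FROM factor
    WHERE d = '2022-08-31'
    GROUP BY d, name
    HAVING SUM(balance) > 0.009
    ORDER BY name
""").fetchall()

# Same date filter repeated in HAVING; legal because d is a grouping column.
where_and_having = conn.execute("""
    SELECT d, name FROM factor
    WHERE d = '2022-08-31'
    GROUP BY d, name
    HAVING SUM(balance) > 0.009 AND d = '2022-08-31'
    ORDER BY name
""").fetchall()

print(where_only == where_and_having)  # True
```

Because the rows are already restricted to one date before grouping, the extra HAVING predicate can never reject a group. Any speed difference would therefore come from how the engine plans the query, not from a different result set.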

Related

Combining two SQL select statements takes 30 to 40 minutes to get executed

RDBMS - MS SQL
How can I combine these two SQL select queries into one and get it to execute quickly?
Query 1
select
VVO.VV_CODE,
V.Vessel_name,
VVO.Arrival_date,
isnull(IGM.VIR_NO,'NULL') as VIR_NO,
isnull(VVO.TERMINAL_CODE,'NULL') as TERMINAL_CODE
from Vessel_voyage VVO, Vessel V, IGM
where V.Vessel_code = substring(VVO.VV_CODE,1,3) and VVO.VV_CODE = IGM.VV_CODE
Query 2
select
BLD.BL_NO,
isnull(BLD.Parent_BL,'NULL') as Parent_BL,
BLD.Consignee_Description,
DO.DO_Issue_Date,
CA.CAgent_Name,
BLC.Container_No ,
CS.Container_Size_Description
from BL_DATA BLD, CAgent CA, Delivery_Order DO, BL_Container BLC, Container_Size CS
where BLD.BL_NO = DO.BL_NO and DO.CAgent_Code = CA.CAgent_Code and BLD.BL_NO = BLC.BL_NO and BLC.Container_Size_Code = CS.Container_Size_Code
Executed individually, these select queries finish within seconds.
But combined into a single select query, they take around 30 to 40 minutes to execute.
This is what I tried:
select
VVO.VV_CODE,
V.Vessel_name,
VVO.Arrival_date,
isnull(IGM.VIR_NO,'NULL') as VIR_NO,
isnull(VVO.TERMINAL_CODE,'NULL') as TERMINAL_CODE,
BLD.BL_NO,
isnull(BLD.Parent_BL,'NULL') as Parent_BL,
BLD.Consignee_Description,
DO.DO_Issue_Date,
CA.CAgent_Name,
BLC.Container_No ,
CS.Container_Size_Description
from Vessel_voyage VVO, Vessel V, IGM ,BL_DATA BLD, CAgent CA, Delivery_Order DO, BL_Container BLC, Container_Size CS
where V.Vessel_code = substring(VVO.VV_CODE,1,3) and VVO.VV_CODE = IGM.VV_CODE and BLD.BL_NO = DO.BL_NO and DO.CAgent_Code = CA.CAgent_Code and BLD.BL_NO = BLC.BL_NO and BLC.Container_Size_Code = CS.Container_Size_Code
Writing the query using ANSI style joins would make it easier to read - so I've done exactly that. It also helps spot where there are problems within the join logic.
Rewriting the joins, I get this:
from Vessel_voyage VVO
inner join Vessel V on V.Vessel_code = substring(VVO.VV_CODE,1,3)
inner join IGM on VVO.VV_CODE = IGM.VV_CODE
inner join BL_DATA BLD on BLD.BL_NO = DO.BL_NO
inner join CAgent CA
inner join Delivery_Order DO on DO.CAgent_Code = CA.CAgent_Code
inner join BL_Container BLC on BLD.BL_NO = BLC.BL_NO
inner join Container_Size CS on BLC.Container_Size_Code = CS.Container_Size_Code
I could move the DO-to-CA join predicate around, but ultimately I have 8 tables being joined with only 6 predicates joining them. Connecting 8 tables takes at least 7 join predicates, so the net result is a Cartesian product against at least one of the tables, which is likely to give incorrect results and will most certainly degrade performance.
If you use this style of join I suspect you will be able to fix it very easily.
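To illustrate the effect of a missing join predicate, here is a minimal sketch using SQLite through Python. The two toy tables and the `vv_code` link column on `bl` are assumptions made for the demonstration; the original schema does not say which column actually relates the vessel tables to the BL tables.

```python
import sqlite3

# Hypothetical toy tables standing in for the two unconnected groups:
# "voyage" plays the vessel/voyage side, "bl" the bill-of-lading side.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE voyage (vv_code TEXT);
    CREATE TABLE bl (bl_no TEXT, vv_code TEXT);  -- vv_code is an assumed link column
    INSERT INTO voyage VALUES ('V01'), ('V02'), ('V03');
    INSERT INTO bl VALUES ('B1', 'V01'), ('B2', 'V01'), ('B3', 'V02'), ('B4', 'V03');
""")

# No predicate connects the tables: every voyage row pairs with every BL row.
unlinked = conn.execute("SELECT COUNT(*) FROM voyage, bl").fetchone()[0]

# With a join predicate, each BL row matches only its own voyage.
linked = conn.execute(
    "SELECT COUNT(*) FROM voyage INNER JOIN bl ON bl.vv_code = voyage.vv_code"
).fetchone()[0]

print(unlinked, linked)  # 12 4
```

With 3 and 4 rows the difference is 12 versus 4. With real table sizes the same multiplication can easily account for a jump from seconds to many minutes.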

MS Access INNER JOIN/LEFT JOIN problems

I have the following SQL string which tries to combine an INNER JOIN with a LEFT JOIN in the FROM section.
As you can see, I use table VIP_APP_VIP_SCENARIO_DETAIL_LE to perform the query. When I use it against this table, Access gives me an "Invalid Operation" error.
Interestingly, when I use the EXACT same query using the VIP_APP_VIP_SCENARIO_DETAIL_BUDGET or VIP_APP_VIP_SCENARIO_DETAIL_ACTUALS table, it performs flawlessly.
So why would it work on two tables but not the other? All fields are in all tables and the data types are correct.
As a side note: on the query with the error, if I change the LEFT JOIN to an INNER JOIN, it runs with no problem! I really need a LEFT JOIN though.
SELECT
D.MATERIAL_NUMBER,
D.MATERIAL_DESCRIPTION,
D.PRODUCTION_LOT_SIZE,
D.STANDARDS_NAME,
D.WORK_CENTER,
S.OP_SHORT_TEXT,
S.OPERATION_CODE,
D.LINE_SPEED_UPM,
D.PERCENT_STD,
D.EQUIPMENT_SU,
D.EQUIPMENT_CU,
D.OPERATOR_NUM,
V.COSTING_LOT_SIZE,
V.VOL_TOTAL_ADJ
FROM
([STDS_SCENARIO: TEST] AS D INNER JOIN MASTER_SUMMARY AS S ON
D.MATERIAL_NUMBER = S.MATERIAL_NUMBER AND D.WORK_CENTER = S.WORK_CENTER)
LEFT JOIN
(SELECT ITEM_CODE, COSTING_LOT_SIZE, VOL_TOTAL_ADJ
FROM
VIP_APP_VIP_SCENARIO_DETAIL_LE
WHERE SCENARIO_ID = 16968) AS V ON D.MATERIAL_NUMBER = V.ITEM_CODE
ORDER BY D.MATERIAL_NUMBER, D.STANDARDS_NAME, S.OPERATION_CODE;
I tried to mock this up in SQL Server with some tables of my own, and the structure seemed to work. This follows the pattern referenced above (hopefully no syntax errors are left here).
SELECT * FROM (
select
D.MATERIAL_NUMBER,
D.MATERIAL_DESCRIPTION,
D.PRODUCTION_LOT_SIZE,
D.STANDARDS_NAME,
D.WORK_CENTER,
S.OP_SHORT_TEXT,
S.OPERATION_CODE,
D.LINE_SPEED_UPM,
D.PERCENT_STD,
D.EQUIPMENT_SU,
D.EQUIPMENT_CU,
D.OPERATOR_NUM
FROM [STDS_SCENARIO: TEST] D
INNER JOIN MASTER_SUMMARY S
ON D.MATERIAL_NUMBER = S.MATERIAL_NUMBER AND D.WORK_CENTER = S.WORK_CENTER) AS J
LEFT JOIN
(SELECT ITEM_CODE, COSTING_LOT_SIZE, VOL_TOTAL_ADJ
FROM
VIP_APP_VIP_SCENARIO_DETAIL_LE
WHERE SCENARIO_ID = 16968) AS V ON J.MATERIAL_NUMBER = V.ITEM_CODE
ORDER BY J.MATERIAL_NUMBER, J.STANDARDS_NAME, J.OPERATION_CODE;
I had help from a friend, and we discovered that it was a casting problem between a linked Oracle table and the Access table. To fix the problem, we cast both sides of the linked fields to a string:
CSTR(D.[MATERIAL_NUMBER]) = CSTR(V.[ITEM_CODE])

Select from view takes too long

I have a query against a table that contains about 2 million rows, accessed through a linked server.
Select * from OPENQUERY(LinkedServerName,
'SELECT
PV.col1
,PV.col2
,PV.col3
,VTR.col1
,CTR.col1
,PSR.col1
FROM
LinkedDbName.dbo.tbl1 PV
INNER JOIN LinkedDbName.dbo.tbl2 VTR
ON PV.col_id = VTR.col_id
INNER JOIN LinkedDbName.dbo.tbl3 CTR
ON PV.col_id = CTR.col_id
INNER JOIN LinkedDbName.dbo.tbl4 PSR
ON PV.col_id = PSR.col_id
WHERE
PV.col_id = ''80C53C9B-6272-11DA-BB34-000E0C7F3ED2''')
That query returns 365 rows and executes in under a second.
However, when I turn that query into a view, it takes at least 20 seconds and sometimes as long as 40 seconds.
Here's my create view script:
CREATE VIEW [dbo].[myview]
AS
Select * from OPENQUERY(LinkedServerName,
'SELECT
PV.col1
,PV.col2
,PV.col3
,VTR.col1
,CTR.col1
,PSR.col1
FROM
LinkedDbName.dbo.tbl1 PV
INNER JOIN LinkedDbName.dbo.tbl2 VTR
ON PV.col_id = VTR.col_id
INNER JOIN LinkedDbName.dbo.tbl3 CTR
ON PV.col_id = CTR.col_id
INNER JOIN LinkedDbName.dbo.tbl4 PSR
ON PV.col_id = PSR.col_id')
then
Select * from myview where PV.col_id = '80C53C9B-6272-11DA-BB34-000E0C7F3ED2'
Any ideas? Thanks!
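Your queries are quite different. In the first, the WHERE clause is part of the SQL statement passed to OPENQUERY(), so it runs on the linked server. In the view, the WHERE clause is applied locally, only after OPENQUERY() has returned every row. Filtering on the linked server has two important effects:
The amount of data returned is much smaller, only being the rows that match the condition.
The linked server can optimize the query using the WHERE clause.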
If you need to share the table, I might suggest that you make a copy on the local server -- either using replication or scheduling a job to copy it over.
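The difference between filtering remotely and filtering after the fact can be sketched as follows. This is only a simulation: an in-memory SQLite database stands in for the linked server, and `pass_through` is a hypothetical helper that mimics OPENQUERY() by running a statement remotely and shipping back every row it produces.

```python
import sqlite3

# An in-memory SQLite database stands in for the linked server (an assumption;
# the real setup is SQL Server with OPENQUERY).
remote = sqlite3.connect(":memory:")
remote.execute("CREATE TABLE tbl1 (col_id TEXT, col1 INTEGER)")
remote.executemany(
    "INSERT INTO tbl1 VALUES (?, ?)",
    [(f"id-{i % 1000}", i) for i in range(100_000)],
)

def pass_through(sql, params=()):
    """Hypothetical stand-in for OPENQUERY: run sql remotely, return all result rows."""
    return remote.execute(sql, params).fetchall()

# Filter inside the remote statement: only matching rows cross the link.
filtered_remotely = pass_through(
    "SELECT col_id, col1 FROM tbl1 WHERE col_id = ?", ("id-7",)
)

# Filter applied afterwards (the view case): the whole table crosses the link first.
everything = pass_through("SELECT col_id, col1 FROM tbl1")
filtered_locally = [row for row in everything if row[0] == "id-7"]

print(len(filtered_remotely), len(everything), len(filtered_locally))  # 100 100000 100
```

Both approaches end with the same 100 rows, but the second one transfers all 100,000 rows to get there, which mirrors what happens when the filter sits outside the view.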

SQL Query: Comparing two dates in returned record

I'm trying to come up with an automated solution for something I do manually now and I only have minimal, bare-bones SQL skill. I usually modify simple queries others have built or will build basic select queries. I have done some reading but don't know how to make it do what I need in this case. I need to come up with something others can use while I am out for a month (and which will save me time when I return).
What I need is to return the fields below where tblThree.EndDate is later than tblFive.ServiceEnd. I have to do a couple of other compares on the dates, but if I get a working query of the first one I can make it work with the others. We use MS SQL Server 2008.
I tried creating sub-queries with aliases and failed miserably at making it work.
These are the table and fields I am working with:
tblOne.ServiceID
tblOne.ServiceYear
tblOne.Status
tblTwo.AccountNbr
tblTwo.AcctName
tblThree.BeginDate (smalldatetime, null)
tblThree.EndDate (smalldatetime, null)
tblFour.ClientID
tblFour.ServiceName
tblFive.ContractID
tblFive.ServiceBegin (smalldatetime, null)
tblFive.ServiceEnd (smalldatetime, null)
This is how the tables are related:
tblOne.ServiceID = tblThree.ServiceID
tblOne.ContractID = tblFive.ContractID
tblOne.ClientID = tblFour.ClientID
tblTwo.AccountNbr = tblFour.Account
I used MS Access 2003 to generate the Join SQL:
SELECT tblOne.ServiceID, tblTwo.AccountNbr,
tblTwo.AcctName, tblFour.ServiceName, tblOne.Status,
tblThree.BeginDate, tblThree.EndDate,
tblOne.ServiceYear, tblFive.ServiceBegin,
tblFive.ServiceEnd
FROM ((tblTwo INNER JOIN tblFour
ON tblTwo.AccountNbr=tblFour.AccountNbr) INNER JOIN (tblThree INNER JOIN tblOne
ON tblThree.ServiceID=tblOne.ServiceID)
ON tblFour.ClientID=tblOne.ClientID) INNER JOIN tblFive
ON tblOne.ContractID=tblFive.ContractID;
Thanks for any help.
Just add a WHERE clause to get started:
SELECT tblOne.ServiceID, tblTwo.AccountNbr,
tblTwo.AcctName, tblFour.ServiceName, tblOne.Status,
tblThree.BeginDate, tblThree.EndDate,
tblOne.ServiceYear, tblFive.ServiceBegin,
tblFive.ServiceEnd
FROM ((tblTwo INNER JOIN tblFour
ON tblTwo.AccountNbr=tblFour.AccountNbr) INNER JOIN (tblThree INNER JOIN tblOne
ON tblThree.ServiceID=tblOne.ServiceID)
ON tblFour.ClientID=tblOne.ClientID) INNER JOIN tblFive
ON tblOne.ContractID=tblFive.ContractID
WHERE tblThree.EndDate > tblFive.ServiceEnd;
SELECT
tblOne.ServiceID,
tblOne.ServiceYear,
tblOne.Status,
tblTwo.AccountNbr,
tblTwo.AcctName,
tblThree.BeginDate,
tblThree.EndDate,
tblFour.ClientID,
tblFour.ServiceName,
tblFive.ContractID,
tblFive.ServiceBegin,
tblFive.ServiceEnd
FROM tblOne
INNER JOIN tblThree
ON tblOne.ServiceID = tblThree.ServiceID
INNER JOIN tblFive
ON tblOne.ContractID = tblFive.ContractID
INNER JOIN tblFour
ON tblOne.ClientID = tblFour.ClientID
INNER JOIN tblTwo
ON tblTwo.AccountNbr = tblFour.Account
WHERE tblThree.EndDate > tblFive.ServiceEnd

Need help optimizing this tSQL Query

I'm definitely not a DBA, and unfortunately we don't have a DBA to consult with at our company. I was wondering if someone could give me a recommendation on how to improve this query, either by changing the query itself or adding indexes to the database.
Looking at the execution plan of the query it seems like the outer joins are killing the query. This query only returns 350k results, but it takes almost 30 seconds to complete. I don't know much about DB's, but I don't think this is good? Perhaps I'm wrong?
Any suggestions would be greatly appreciated. Thanks in advance.
As a side note, this is obviously being created by an ORM and not by me directly. We are using Linq-to-SQL.
SELECT
[t12].[value] AS [DiscoveryEnabled],
[t12].[value2] AS [isConnected],
[t12].[Interface],
[t12].[Description] AS [InterfaceDescription],
[t12].[value3] AS [Duplex],
[t12].[value4] AS [IsEnabled],
[t12].[value5] AS [Host],
[t12].[value6] AS [HostIP],
[t12].[value7] AS [MAC],
[t12].[value8] AS [MACadded],
[t12].[value9] AS [PortFast],
[t12].[value10] AS [PortSecurity],
[t12].[value11] AS [ShortHost],
[t12].[value12] AS [SNMPlink],
[t12].[value13] AS [Speed],
[t12].[value14] AS [InterfaceStatus],
[t12].[InterfaceType],
[t12].[value15] AS [IsUserPort],
[t12].[value16] AS [VLAN],
[t12].[value17] AS [Code],
[t12].[Description2] AS [Description],
[t12].[Host] AS [DeviceName],
[t12].[NET_OUID],
[t12].[DisplayName] AS [Net_OU],
[t12].[Enclave]
FROM (
SELECT
[t1].[DiscoveryEnabled] AS [value],
[t1].[IsConnected] AS [value2],
[t0].[Interface],
[t0].[Description],
[t2].[Duplex] AS [value3],
[t0].[IsEnabled] AS [value4],
[t3].[Host] AS [value5],
[t6].[Address] AS [value6],
[t3].[MAC] AS [value7],
[t3].[MACadded] AS [value8],
[t2].[PortFast] AS [value9],
[t2].[PortSecurity] AS [value10],
[t4].[Host] AS [value11],
[t0].[SNMPlink] AS [value12],
[t2].[Speed] AS [value13],
[t2].[InterfaceStatus] AS [value14],
[t8].[InterfaceType],
[t0].[IsUserPort] AS [value15],
[t2].[VLAN] AS [value16],
[t9].[Code] AS [value17],
[t9].[Description] AS [Description2],
[t7].[Host], [t7].[NET_OUID],
[t10].[DisplayName],
[t11].[Enclave],
[t7].[Decommissioned]
FROM [dbo].[IDB_Interface] AS [t0]
LEFT OUTER JOIN [dbo].[IDB_InterfaceLayer2] AS [t1] ON [t0].[IDB_Interface_ID] = [t1].[IDB_Interface_ID]
LEFT OUTER JOIN [dbo].[IDB_LANinterface] AS [t2] ON [t1].[IDB_InterfaceLayer2_ID] = [t2].[IDB_InterfaceLayer2_ID]
LEFT OUTER JOIN [dbo].[IDB_Host] AS [t3] ON [t2].[IDB_LANinterface_ID] = [t3].[IDB_LANinterface_ID]
LEFT OUTER JOIN [dbo].[IDB_Infrastructure] AS [t4] ON [t0].[IDB_Interface_ID] = [t4].[IDB_Interface_ID]
LEFT OUTER JOIN [dbo].[IDB_AddressMapIPv4] AS [t5] ON [t3].[IDB_AddressMapIPv4_ID] = ([t5].[IDB_AddressMapIPv4_ID])
LEFT OUTER JOIN [dbo].[IDB_AddressIPv4] AS [t6] ON [t5].[IDB_AddressIPv4_ID] = [t6].[IDB_AddressIPv4_ID]
INNER JOIN [dbo].[ART_Asset] AS [t7] ON [t7].[ART_Asset_ID] = [t0].[ART_Asset_ID]
LEFT OUTER JOIN [dbo].[NSD_InterfaceType] AS [t8] ON [t8].[NSD_InterfaceTypeID] = [t0].[NSD_InterfaceTypeID]
INNER JOIN [dbo].[NSD_InterfaceCode] AS [t9] ON [t9].[NSD_InterfaceCodeID] = [t0].[NSD_InterfaceCodeID]
INNER JOIN [dbo].[NET_OU] AS [t10] ON [t10].[NET_OUID] = [t7].[NET_OUID]
INNER JOIN [dbo].[NET_Enclave] AS [t11] ON [t11].[NET_EnclaveID] = [t10].[NET_EnclaveID]
) AS [t12]
WHERE ([t12].[Enclave] = 'USMC') AND (NOT ([t12].[Decommissioned] = 1))
LINQ-TO-SQL Query:
return from t in db.IDB_Interfaces
join v in db.IDB_InterfaceLayer3s on t.IDB_Interface_ID equals v.IDB_Interface_ID
join u in db.ART_Assets on t.ART_Asset_ID equals u.ART_Asset_ID
join c in db.NET_OUs on u.NET_OUID equals c.NET_OUID
join w in
(from d in db.IDB_InterfaceIPv4s
select new { d.IDB_InterfaceIPv4_ID, d.IDB_InterfaceLayer3_ID, d.IDB_AddressMapIPv4_ID, d.IDB_AddressMapIPv4.IDB_AddressIPv4.Address })
on v.IDB_InterfaceLayer3_ID equals w.IDB_InterfaceLayer3_ID
join h in db.NET_Enclaves on c.NET_EnclaveID equals h.NET_EnclaveID into enclaveLeftJoin
from i in enclaveLeftJoin.DefaultIfEmpty()
join m in
(from z in db.IDB_StandbyIPv4s
select new
{
z.IDB_InterfaceIPv4_ID,
z.IDB_AddressMapIPv4_ID,
z.IDB_AddressMapIPv4.IDB_AddressIPv4.Address,
z.Preempt,
z.Priority
})
on w.IDB_InterfaceIPv4_ID equals m.IDB_InterfaceIPv4_ID into standbyLeftJoin
from k in standbyLeftJoin.DefaultIfEmpty()
where t.ART_Asset.Decommissioned == false
select new NetIDBGridDataResults
{
DeviceName = u.Host,
Host = u.Host,
Interface = t.Interface,
IPAddress = w.Address,
ACLIn = v.InboundACL,
ACLOut = v.OutboundACL,
VirtualAddress = k.Address,
VirtualPriority = k.Priority,
VirtualPreempt = k.Preempt,
InterfaceDescription = t.Description,
Enclave = i.Enclave
};
As a rule (and this is very general), you want an index on:
JOIN fields (both sides)
Common WHERE filter fields
Possibly fields you aggregate
For this query, start with checking your JOIN criteria. Any one of those missing will force a table scan which is a big hit.
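As a rough illustration of the table-scan point, the sketch below asks SQLite (through Python) for its query plan before and after indexing a join key column. The table `iface`, its columns, and the index name are hypothetical, and SQL Server's execution plans look different, but the scan-versus-index-lookup distinction is the same idea.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table; asset_id plays the role of a column used in a JOIN.
conn.execute("CREATE TABLE iface (iface_id INTEGER, asset_id INTEGER, name TEXT)")
conn.executemany(
    "INSERT INTO iface VALUES (?, ?, ?)",
    [(i, i % 500, f"if{i}") for i in range(5000)],
)

def plan(sql):
    """Return SQLite's query plan for sql as one string (detail is the 4th column)."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[3] for row in rows)

# The kind of lookup a join performs for each row on the other side.
lookup = "SELECT name FROM iface WHERE asset_id = 42"

before = plan(lookup)  # no index yet: the plan reports a full scan
conn.execute("CREATE INDEX idx_iface_asset ON iface (asset_id)")
after = plan(lookup)   # with the index: the plan reports a search using it

print(before)
print(after)
```

The exact plan wording varies between SQLite versions, so no expected output is shown; the first plan should mention a scan and the second should name `idx_iface_asset`.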
Looking at the execution plan of the query it seems like the outer joins are killing the query.
This query only returns 350k results, but it takes almost 30 seconds to complete. I don't know
much about DB's, but I don't think this is good? Perhaps I'm wrong?
A man has got to do what a man has got to do.
The joins may kill you, but when you need them YOU NEED THEM. Some tasks take long.
Make sure you have all the indices you need.
Make sure your SQL Server is not a sad joke hardware-wise.
That is all you can do.
I would bet someone has no clue about SQL and needs to be enlightened to the power of indices.