I have a WPF application whose task is to pull data from an InterBase DB. Note that this DB is located on a remote network device. Also, the Firebird ADO.NET data provider is used.
One of my queries looks like this:
SELECT
T1.ind_st,
T2.ttt,
T2.tdtdtd,
sumr
FROM ((SELECT ind_st,
Sum(r) AS sumR
FROM (SELECT ind_st,
rrr AS r
FROM srok_tel
WHERE date_ch = '23.07.2018 0:00:00'
AND srok_ch = '18'
AND ind_st >= 33049
AND ind_st <= 34717
UNION
SELECT ind_st,
-rrr AS r
FROM srok_tel
WHERE date_ch = '23.07.2018 0:00:00'
AND srok_ch = '12'
AND ind_st >= 33049
AND ind_st <= 34717
UNION
SELECT ind_st,
rrr AS r
FROM srok_tel
WHERE date_ch = '24.07.2018 0:00:00'
AND srok_ch IN ( 6, 12 )
AND ind_st >= 33049
AND ind_st <= 34717)
GROUP BY ind_st) T1
JOIN (SELECT ind_st,
ttt,
tdtdtd
FROM srok_tel
WHERE date_ch = '24.07.2018 0:00:00'
AND srok_ch = '12'
AND ind_st >= 33049
AND ind_st <= 34717) T2
ON T1.ind_st = T2.ind_st)
Yes, it is heavy, hard to read at first glance, and probably written the wrong way, but my task is to pull all the data with one query, and I am NOT an SQL pro.
The target table (SROK_TEL), from which the data is selected, contains approximately 10^7 rows. The query run time is about 90 seconds, which is significantly longer than I would like.
Any suggestions on how to make this query run faster?
UPDATE1: At luisarcher's request, I've added the query plan (I hope that's exactly what he asked for):
PLAN JOIN (SORT ((T1 SROK_TEL NATURAL)
PLAN (T1 SROK_TEL NATURAL)
PLAN (T1 SROK_TEL NATURAL)), T2 SROK_TEL INDEX (PK_SROK_TEL))
I've had an issue like yours not long ago, so I'll share some tips that apply to your situation:
1) If you don't mind having duplicates, you can use UNION ALL instead of UNION. UNION performs an implicit DISTINCT, which adds a sort/deduplication step you may not need.
2) Restrict the data you use. This one is important; I got about a 90% reduction in execution time by removing data I didn't need from the query (more specific WHERE clauses, not selecting useless columns).
3) Check whether you can add an index to your table srok_tel, as sketched below.
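For point 3, a minimal sketch (the index name and column order are my assumptions, based on the WHERE clauses in your query, which all filter on date_ch, srok_ch and a range of ind_st):
CREATE INDEX idx_srok_tel_date_srok_ind
    ON srok_tel (date_ch, srok_ch, ind_st);
With such a compound index in place, the engine should be able to replace the three SROK_TEL NATURAL full-table scans shown in your plan with indexed retrievals.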
UPDATE: Changed title. Previous title "Does UNION instead of OR always speed up queries?"
Here is my query. The question concerns the second-to-last line, the one with the OR:
SELECT distinct bigUnionQuery.customer
FROM ((SELECT buyer.customer
FROM membership_vw buyer
JOIN account_vw account
ON account.buyer = buyer.id
WHERE account.closedate >= 'some_date')
UNION
(SELECT joint.customer
FROM entity_vw joint
JOIN transactorassociation_vw assoc
ON assoc.associatedentity = joint.id
JOIN account_vw account
ON account.buyer = assoc.entity
WHERE assoc.account is null and account.closedate >= 'some_date')
UNION
(SELECT joint.customer
FROM entity_vw joint
JOIN transactorassociation_vw assoc
ON assoc.associatedentity = joint.id
JOIN account_vw account
ON account.id = assoc.account
WHERE account.closedate >= '2021-02-11 00:30:22.339'))
AS bigUnionQuery
JOIN entity_vw
ON entity_vw.customer = bigUnionQuery.customer OR entity_vw.id = bigUnionQuery.customer
WHERE entity_vw.lastmodifieddate >= 'some_date';
The original query doesn't have the OR in that second-to-last line. Adding the OR has slowed down the query, and I'm wondering if there is a way to use UNION here to speed it up.
I tried doing (pseudo):
SELECT e.customer FROM bigUnionQuery bq JOIN entity_vw e ON e.customer = bq.customer
UNION
SELECT e.customer FROM bigUnionQuery bq JOIN entity_vw e ON e.id = bq.customer
But that slowed down the query even more, probably because bigUnionQuery is a large, slow query, and running it twice inside the UNION is not the right approach. What would be the right way to use UNION here, or is it always going to be faster with the OR?
Does UNION instead of OR always speed up queries? In some cases it does; I think it depends on your indexes too. I have worked on tables with a million records, and my queries usually get faster when I use UNION instead of OR or AND.
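One way to get the UNION rewrite without evaluating the expensive subquery twice is to materialize it once, for example with a CTE. A sketch reusing the names from the question (whether the CTE is actually materialized depends on your database; in PostgreSQL 12+ you can force it with AS MATERIALIZED):
WITH bigUnionQuery AS (
    -- the three-way UNION from the question, unchanged
    SELECT buyer.customer
    FROM membership_vw buyer
    JOIN account_vw account ON account.buyer = buyer.id
    WHERE account.closedate >= 'some_date'
    -- UNION the other two branches exactly as in the question
)
SELECT bq.customer
FROM bigUnionQuery bq
JOIN entity_vw e ON e.customer = bq.customer
WHERE e.lastmodifieddate >= 'some_date'
UNION
SELECT bq.customer
FROM bigUnionQuery bq
JOIN entity_vw e ON e.id = bq.customer
WHERE e.lastmodifieddate >= 'some_date';
The outer UNION removes duplicates, so the DISTINCT from the original query is no longer needed.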
I wrote a view in a database. The view takes 0 seconds to run when called from one database and 2.5 minutes when called from another.
I have created a video that best describes this problem. Watch it here: https://youtu.be/jEqI2bUyelQ
I tried recreating the view by dropping it.
I tried comparing the query execution plans; they are different when the view is run from one database vs. the other.
I looked into the query itself and noticed that if you remove the WHERE clause, the performance is regained and it takes the same amount of time from both.
The expected result is that it takes 0 seconds to run no matter which database the view is called from.
Here is the SQL script:
SELECT
cus.MacolaCustNo,
dsp.cmp_code ,
count(distinct dsp.item_no) AS InventoryOnDisplay,
(SELECT max(dsp.LastSynchronizationDate)
FROM Hinkley.dbo.vw_HH_next_Capture_date ) AS UpdatedDate,
case
WHEN DATEADD(DAY, 90, isnull(max(dsp.LastSynchronizationDate),'1/1/1900')) >=
(SELECT max(dsp.LastSynchronizationDate)
FROM Hinkley.dbo.vw_HH_next_Capture_date )
THEN 'Compliant'
WHEN DATEADD(DAY, 90, isnull(max(dsp.LastSynchronizationDate),'1/1/1900')) <=
(SELECT max(dsp.LastSynchronizationDate)
FROM Hinkley.dbo.vw_HH_next_Capture_date )
AND DATEADD(DAY, 90, isnull(max(dsp.LastSynchronizationDate),'1/1/1900')) >= getdate()
THEN 'Warning'
ELSE 'Non-Compliant'
END AS Inventory_Status
FROM
Hinkley.dbo.HLIINVDSP_SQL dsp (nolock)
INNER JOIN
[DATA].dbo.vw_HLI_Customer cus (nolock)
ON cus.CusNo = dsp.cmp_code
WHERE
cus.cust_showroom = 1
AND
cus.active_y = 1
GROUP BY cus.MacolaCustNo,dsp.cmp_code
The task: I have an application that is similar to a time card, if you will. However, any employee may have one or more claim entries that overlap with others. The aggregation is currently being done in VB.NET, but there are huge performance issues that way, so my goal is to use T-SQL, if possible, to do this for me. Hopefully this makes sense. Each claim entry has a notes field, and the notes should be combined when entries overlap. It works something like this:
Claim-1: ClaimID-123 Start-"9:00" End-"10:00" Notes-"Testing 1"
Claim-2: ClaimID-456 Start-"9:30" End-"10:30" Notes-"Testing 2"
Desired Result: Start-"9:00", End-"10:30", concatenating the notes column to include notes from both claim entries.
Here is the SQL code I have so far:
SELECT s1.StartTime,
MIN(t1.EndTime) As EndTime
FROM vw_ClaimLine s1
INNER JOIN vw_ClaimLine t1 ON s1.StartTime <= t1.EndTime
AND NOT EXISTS(SELECT * FROM vw_ClaimLine t2
WHERE t1.EndTime >= t2.StartTime AND t1.EndTime < t2.EndTime)
WHERE NOT EXISTS(SELECT * FROM vw_ClaimLine s2
WHERE s1.StartTime > s2.StartTime AND s1.StartTime <= s2.EndTime)
AND
s1.RecDate BETWEEN '4-01-2018' AND '4-1-2018' AND s1.ProvidedBy = 233
GROUP BY s1.StartTime
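A sketch of one way to add the notes concatenation on top of the query above, wrapping it in a CTE and concatenating with FOR XML PATH (the usual pre-SQL-Server-2017 idiom; STRING_AGG is simpler on 2017+). The date and provider filters are left out for brevity, and it assumes each claim falls entirely inside exactly one packed interval:
WITH Packed AS (
    -- the interval-packing query from above: one row per merged period
    SELECT s1.StartTime,
           MIN(t1.EndTime) AS EndTime
    FROM vw_ClaimLine s1
    INNER JOIN vw_ClaimLine t1
            ON s1.StartTime <= t1.EndTime
           AND NOT EXISTS (SELECT * FROM vw_ClaimLine t2
                           WHERE t1.EndTime >= t2.StartTime AND t1.EndTime < t2.EndTime)
    WHERE NOT EXISTS (SELECT * FROM vw_ClaimLine s2
                      WHERE s1.StartTime > s2.StartTime AND s1.StartTime <= s2.EndTime)
    GROUP BY s1.StartTime
)
SELECT p.StartTime,
       p.EndTime,
       -- concatenate the notes of every claim inside the packed period
       STUFF((SELECT '; ' + c.Notes
              FROM vw_ClaimLine c
              WHERE c.StartTime >= p.StartTime
                AND c.EndTime <= p.EndTime
              ORDER BY c.StartTime
              FOR XML PATH('')), 1, 2, '') AS Notes
FROM Packed p;
STUFF(..., 1, 2, '') just strips the leading '; ' separator.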
I have the below long-running SELECT query, and I want to view its execution plan to understand why it is slow and which part of the SQL is hurting performance. I am using Oracle SQL Developer; I checked the explain plan for the query below, but I could not clearly tell from it which part of the query to optimize.
Select *
from PROVISIONING_LOG#FONIC_RETAIL PL
JOIN PROVISIONING_TASK#FONIC_RETAIL PT ON PL.PROVISIONING_TASK_ID = PT.ID
JOIN SERVICE#FONIC_RETAIL SER ON PT.SERVICE_ID = SER.ID
JOIN TEMP_WF_DEF_ALL TT ON SER.SUBSCRIPTION_ID = TT.SUBSCRIPTION_ID
where PT.CODE = 'MIGOPT_PACK'
and PT.DESCRIPTION LIKE '%CVB Request'
AND PT.PARAMETERS LIKE '%OPERATION=ADD%'
AND PL.RESPONSE_TYPE IS NULL
AND PL.REQUEST IS NOT NULL
and ((to_char(PT.START_DATE,'YYYYMMDDHH24Mi') = to_char(TT.COMPLETE_DATE,'YYYYMMDDHH24Mi'))
or (to_char(PT.START_DATE,'YYYYMMDDHH24Mi') = to_char(TT.COMPLETE_DATE + 1/1440,'YYYYMMDDHH24Mi'))) AND
PL.TIME_STAMP < SYSDATE - numtodsinterval ( 30,'MINUTE' )
and PL.TIME_STAMP > SYSDATE - numtodsinterval ( 4,'HOUR' )
AND TT.START_DATE < SYSDATE - numtodsinterval ( 30,'MINUTE' )
and TT.START_DATE > SYSDATE - numtodsinterval ( 4,'HOUR' )
AND TT.WF_NAME IN
('Subscribe LIDL Community Flat',
'LDLMonatsFlatrate Subscribe');
Query execution plan for the above query:
You are using a mix of local tables and remote tables. If the tables on the remote database are larger than the ones on the local database, then you might need to use the DRIVING_SITE hint so that the smaller set of tables is moved to the database issuing the call.
DRIVING_SITE
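For example (a sketch; the alias in the hint is whichever remote table you want the statement executed next to, here PL from the query above):
SELECT /*+ DRIVING_SITE(PL) */ *
FROM PROVISIONING_LOG#FONIC_RETAIL PL
JOIN PROVISIONING_TASK#FONIC_RETAIL PT ON PL.PROVISIONING_TASK_ID = PT.ID
-- ... rest of the joins and WHERE clause unchanged
The hint tells Oracle to execute the statement at the site of the named table, so the smaller local rows are shipped across the link instead of pulling the large remote tables to the local database.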
I have some questions about my query. I call this stored procedure on my first page, so it is important to me that it is well optimized.
I do a SELECT with some basic WHERE expressions, then filter the rows with expressions I pass into the stored procedure.
It also matters to me that I select the top N rows: the query will eventually search through millions of items (though I have only hundreds of items so far), and then I do some paging on my website.
Select top (@NumberOfRows)
...
from(
SELECT
row_number() OVER (ORDER BY tblEventOpen.TicketAt, tblEvent.EventName, tblEventDetail.TimeStart) as RowNumber
, ...
FROM --[...some inner join logic...]
WHERE
(tblEventOpen.isValid = 1) AND (tblEvent.isValid = 1) and
(tblCondition_ResellerDetail.ResellerID = 1) AND
(tblEventOpen.TicketAt >= GETDATE()) AND
(GETDATE() BETWEEN
DATEADD(minute, (tblEventDetail.TimeStart - 60 * tblCondition_ResellerDetail.StartTime) , tblEventOpen.TicketAt)
AND DATEADD(minute, (tblEventDetail.TimeStart - 60 * tblCondition_ResellerDetail.EndTime) , tblEventOpen.TicketAt))
) as t1
where RowNumber >= (@PageNumber - 1) * @NumberOfRows and
(@city = '' or @city is null or city like @city) and
(@At is null or @At = At) and
(@TimeStartInMinute = -1 or @TimeStartInMinute = TimeStartInMinute) and
(@EventName = '' or EventName like @EventName) and
(@CategoryID = -1 or @CategoryID = CategoryID) and
(@EventID is null or @EventID = EventID) and
(@DetailID is null or @DetailID = DetailID)
ORDER BY RowNumber
I'm worried about this part:
(GETDATE() BETWEEN
DATEADD(minute, (tblEventDetail.TimeStart - 60 * tblCondition_ResellerDetail.StartTime) , tblEventOpen.TicketAt)
AND DATEADD(minute, (tblEventDetail.TimeStart - 60 * tblCondition_ResellerDetail.EndTime) , tblEventOpen.TicketAt))
How does table t1 execute? I mean, when I put WHERE expressions after t1, does the filtering happen after t1 executes? For example, if I filter the result to 10 rows by RowNumber, does the inner (...) as t1 select return only those 10 items, or does it select all items and then the outer select takes 10 of them?
I want to filter my result by some optional parameters, so I wrote things like @DetailID is null or @DetailID = DetailID; is that a good way?
Is there anything else I should consider to make it faster (more optimized)?
My comments on your query:
1) You're correct, you should worry about the condition "GETDATE() BETWEEN ...". Comparing a value against an expression that involves columns from more than one table will most likely scan the entire search space. Simplify the condition or, if possible, add a computed column for such an expression.
2) Put all conditions except "RowNumber >= ..." in the inner query.
3) It's okay to put optional conditions the way you do. I do it too :-)
4) Make sure you have an index for each column used in the WHERE clause, with that column as the first column of the index, followed by the primary key (a sketch follows below). It would be better if your primary key were clustered.
Well, these are based on my own experience. They may or may not apply to your situation.
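For point 4, a hypothetical example (the table and column names are taken from your query; the index names and the best column choice are assumptions that depend on selectivity and the actual plan):
-- supports the TicketAt >= GETDATE() range filter
CREATE INDEX IX_tblEventOpen_TicketAt ON tblEventOpen (TicketAt, isValid);
-- supports the ResellerID = 1 equality filter
CREATE INDEX IX_tblCondition_ResellerDetail_ResellerID ON tblCondition_ResellerDetail (ResellerID);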
[UPDATE] Here's the complete query:
Select top (@NumberOfRows)
...
from(
SELECT
row_number() OVER (ORDER BY tblEventOpen.TicketAt, tblEvent.EventName, tblEventDetail.TimeStart) as RowNumber
, ...
FROM --[...some inner join logic...]
WHERE
(tblEventOpen.isValid = 1) AND (tblEvent.isValid = 1) and
(tblCondition_ResellerDetail.ResellerID = 1) AND
(tblEventOpen.TicketAt >= GETDATE()) AND
(GETDATE() BETWEEN
DATEADD(minute, (tblEventDetail.TimeStart - 60 * tblCondition_ResellerDetail.StartTime) , tblEventOpen.TicketAt)
AND DATEADD(minute, (tblEventDetail.TimeStart - 60 * tblCondition_ResellerDetail.EndTime) , tblEventOpen.TicketAt)) and
(@city = '' or @city is null or city like @city) and
(@At is null or @At = At) and
(@TimeStartInMinute = -1 or @TimeStartInMinute = TimeStartInMinute) and
(@EventName = '' or EventName like @EventName) and
(@CategoryID = -1 or @CategoryID = CategoryID) and
(@EventID is null or @EventID = EventID) and
(@DetailID is null or @DetailID = DetailID)
) as t1
where RowNumber >= (@PageNumber - 1) * @NumberOfRows
ORDER BY RowNumber
Whilst you can seek advice on your query, it is better to learn how to optimise it yourself.
You need to view the execution plan, identify the bottlenecks and then see if there is anything that can be done to make an improvement.
In SSMS you can click "Query" ---> "Include Actual Execution Plan" before you run your query; Ctrl+M is the keyboard shortcut.
Then execute your query. SSMS will create a new tab in the results pane showing how the SQL engine executes your query; you can hover over each node for more information. The cost % will be particularly interesting, letting you see the most expensive parts of your query.
It's difficult to advise you any more without that execution plan, which is why a number of people commented on your question. Your schema and indexes change how the query is executed, so it's not something that someone can accurately replicate in their own environment without scripts for tables, indexes, etc. Even then, statistics could be out of date and other problems could arise.
You can also execute SET STATISTICS PROFILE ON to get a textual view of the plan (maybe useful to seek help).
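For example:
SET STATISTICS PROFILE ON;
-- run the query you want to analyze; any query works here
SELECT TOP (10) * FROM tblEventOpen;
SET STATISTICS PROFILE OFF;
The results pane then includes the plan as text, which is easy to paste into a question when asking for help.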
There are a number of articles that can help you fix the bottlenecks, or post another question for more advice.
http://msdn.microsoft.com/en-us/library/ms178071.aspx
SQL Server Query Plan Analysis
Execution Plan Basics