TSQL: Different execution due to parallelism - sql-server-2012

One of my queries performs slowly for a small number of records, but for large record sets it works as expected.
For example:
Query #1
SELECT *
FROM vwServwerHealthReport
WHERE startDate >= '20/02/2018'
AND startDate <= '06/03/2018'
-- fetching more than 6000 records in 0:02 seconds
Query #2
SELECT *
FROM vwServwerHealthReport
WHERE startDate >= '02/03/2018'
AND startDate <= '06/03/2018'
-- fetching approx 800 records in 02:05 seconds
I have checked the execution plans as well and found that the two plans are different: the slow query uses parallelism operators heavily, while the fast query does not use any parallelism at all.
Initially I assumed this was a parameter sniffing problem and tried the RECOMPILE and OPTIMIZE FOR hints, but saw no improvement.
Please let me know why parallelism is being used by the slow query even for a small number of records, and how I can resolve this issue.
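For reference, a minimal sketch of how such hints attach to an ad-hoc query via an OPTION clause; OPTION (MAXDOP 1) is not something the question says was tried, but it is a common way to check whether the parallel plan itself is responsible for the slowdown:
-- Sketch only: force a fresh plan for these literal values
SELECT *
FROM vwServwerHealthReport
WHERE startDate >= '02/03/2018'
AND startDate <= '06/03/2018'
OPTION (RECOMPILE)
-- Sketch only: suppress parallelism to compare the plan and timing
SELECT *
FROM vwServwerHealthReport
WHERE startDate >= '02/03/2018'
AND startDate <= '06/03/2018'
OPTION (MAXDOP 1)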

Related

Oracle SQL query performance change when changing date predicate by one day

I have a very complex SQL query with many joins that runs in a few seconds with one date in the predicate, but when I change the date by one day, I end up cancelling the query after 15 minutes. PHA is the PO.PO_HEADERS_ALL table in Oracle, and the CREATION_DATE column is defined as type DATE.
--this finishes in a few seconds with 444 records
and pha.creation_date > to_date('23-JAN-2021','DD-MON-YYYY')
-- this never finishes
and pha.creation_date > to_date('22-JAN-2021','DD-MON-YYYY')
If I use TRUNC(pha.creation_date) in the predicate, then both queries finish in a few seconds, with the expected results.
Just wondering if someone can explain why one day would make such a difference? The explain plans in TOAD did not really look any different between the two queries.
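For clarity, the fast variant the poster describes simply wraps the column, as in the fragment below (note, as an aside not stated in the question, that applying TRUNC to the column normally prevents a plain index on creation_date from being used):
-- the variant that finishes in a few seconds for either date
and TRUNC(pha.creation_date) > to_date('22-JAN-2021','DD-MON-YYYY')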

What can cause a SQL query to run significantly longer on different Oracle machines with the same SW and HW

I am using a PL/SQL procedure to aggregate data from 4 servers into one place, and it has significantly lower performance on one of the servers.
The indexes are fine and usable, the partitioning is the same, and the row volume is a bit bigger on the fourth server, but not enough to cause this difference: same hardware, same Linux, same Oracle version (11.2.0.4.0), same structure of the table and indexes.
I have this relatively simple query to aggregate data from the audit log:
INSERT INTO app_stats_agg_hourly (SHIPMENTS,event_datetime,COUNTRY,data_type,event_type,collection_time,INSTANCE)
SELECT COUNT(*) as SHIPMENTS, TRUNC(event_datetime,'HH24') as event_datetime,COUNTRY,data_type,event_type,load_run_start,v_region
FROM APP_AUDIT
WHERE
event_type in ('FromApp','ToError','ToApp','Generate','FromCorr')
and event_datetime > last_collection
and event_datetime <= last_event
GROUP BY TRUNC(event_datetime,'HH24'),COUNTRY,data_type,event_type;
It works well and exactly as expected. The last_collection and last_event variables are initialized earlier in the procedure.
This code runs for a minute or two on 3 of the 4 servers and half an hour on the fourth.
After a long investigation I found that the variables I am using are TIMESTAMPs while the event_datetime column is a DATE. If I use DATE variables instead, it works like a charm. My question is: what is causing this to behave differently?
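A minimal sketch of the workaround that finding implies, assuming the bind variables can simply be cast; the usual suspect is the implicit DATE-to-TIMESTAMP conversion being applied to the indexed column, though the question does not confirm that:
-- Sketch only: compare in the column's own datatype so an index range scan stays possible
and event_datetime > CAST(last_collection AS DATE)
and event_datetime <= CAST(last_event AS DATE)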

Why might a CASE expression be more efficient than a WHERE clause?

Can anyone suggest why my particular query on a table with hundreds of millions of rows works more efficiently when I put what would typically be the WHERE clause into a CASE expression?
Logically, you would do:
SELECT
NOTES_TABLE.CASE_ID,
NOTES_TABLE.CASE_NOTE,
NOTES_TABLE.ACTIVITY_NAME,
NOTES_TABLE.CREATED_DATETIME
FROM
NOTES_TABLE
WHERE
NOTES_TABLE.ACTIVITY_NAME = 'Some Activity'
AND NOTES_TABLE.CREATED_DATETIME > '01 Jan 2015 00:00:00'
AND NOTES_TABLE.CASE_NOTE='Some Activity has been processed'
However, I've noticed that:
select t1.* FROM
(
SELECT
NOTES_TABLE.CASE_ID,
case
when NOTES_TABLE.CASE_NOTE='Some Activity has been processed'
then NOTES_TABLE.CASE_NOTE else null
end as NOTES_TABLEs,
NOTES_TABLE.ACTIVITY_NAME,
NOTES_TABLE.CREATED_DATETIME
FROM
NOTES_TABLE
WHERE
NOTES_TABLE.ACTIVITY_NAME = 'Some Activity'
AND NOTES_TABLE.CREATED_DATETIME > '01 Jan 2015 00:00:00'
) t1
WHERE t1.NOTES_TABLEs is not null
runs in a matter of seconds rather than 30+ minutes.
This question has bugged me for a while, but unfortunately I don't have direct database access (I'm using the Infoview front-end), so I cannot get an explain plan. My curiosity is not really a suitable reason for a service request to ask my supplier to explain.
This is too long for a comment.
The difference in performance is, no doubt, due to using an index versus a full table scan. Oracle generally has a good optimizer, so it is surprising that it would miss such an optimization opportunity.
30+ minutes for a query that uses only one table is an inordinately long time. This suggests that your table has many billions of rows and you have a pretty slow computer. How large is the table? How slow is your processor?
The other possibility is that other operations were taking place on the server, such as a checkpoint or data modifications on the table.
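If someone with direct database access can run it, a standard way to confirm the index-versus-full-scan theory is EXPLAIN PLAN plus DBMS_XPLAN (a sketch only, reusing the predicate from the question):
-- Sketch only: capture and display the optimizer's plan for the slow form
EXPLAIN PLAN FOR
SELECT NOTES_TABLE.CASE_ID, NOTES_TABLE.CASE_NOTE,
       NOTES_TABLE.ACTIVITY_NAME, NOTES_TABLE.CREATED_DATETIME
FROM NOTES_TABLE
WHERE NOTES_TABLE.ACTIVITY_NAME = 'Some Activity'
AND NOTES_TABLE.CREATED_DATETIME > '01 Jan 2015 00:00:00'
AND NOTES_TABLE.CASE_NOTE = 'Some Activity has been processed';

SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);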

Append data first then group, or group first then append

I have two tables with the exact same format. Since each table has a date column (the date used to create the table), grouping first or appending first will not make any difference to the result.
I use two queries to test:
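-- Query 1: group within each table first, then append the grouped results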
SELECT * FROM
(SELECT
TXN,CONT,ReportingDate,sum(AMT) AS TOT
FROM Table1
GROUP BY TXN,CONT,ReportingDate
UNION ALL
SELECT
TXN,CONT,ReportingDate,sum(AMT) AS TOT
FROM Table2
GROUP BY TXN,CONT,ReportingDate)
TEST
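-- Query 2: append the raw rows first, then group over the combined set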
SELECT TXN, CONT,Reportingdate,sum(AMT)
from
(
SELECT
TXN,CONT,AMT,ReportingDate
FROM Table1
UNION ALL
SELECT
TXN,CONT,AMT,ReportingDate
FROM Table2
)
test
GROUP BY
TXN,CONT,Reportingdate
(22596 row(s) affected)
SQL Server Execution Times:
CPU time = 156 ms, elapsed time = 2582 ms.
(22596 row(s) affected)
SQL Server Execution Times:
CPU time = 125 ms, elapsed time = 2337 ms.
The statistics do not show much difference, and the timings change a little every time I run the queries.
The execution plan (screenshot not included).
Which one will be faster? I have listed just one result here; I ran these two queries 10 times, and query 1 was faster in 7 of them.
The reportingdate column will be totally different in the two tables, so there will be no duplicate results for query 1. For example, the reportingdate in table 1 is all 10/28/2015, and the reportingdate in table 2 is all 10/29/2015.
Thanks
Typically, when deciding which version of a SQL statement to use, I consider the following:
Will they both return the same results? As mentioned by Gordon in the comments, conceptually the first would return a row duplicated in both tables as two separate rows, whereas the second would group them together and you would see the sum of both of them (see the sketch below).
Performance difference. There is not much performance difference here, but the second one does seem to be faster, which makes sense: the DBMS can fetch all the rows and sum once, rather than fetch some rows, sum, fetch more rows, and sum again.
Readability/maintainability. In your opinion, when someone is debugging this later on, would they rather test the inner statements with or without a grouping statement? Really your call on this one.
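A tiny hypothetical illustration of the first point, using invented table variables and values, showing how the two forms diverge when the same (TXN, CONT, ReportingDate) key appears in both tables:
DECLARE @T1 TABLE (TXN varchar(10), CONT varchar(10), ReportingDate date, AMT int)
DECLARE @T2 TABLE (TXN varchar(10), CONT varchar(10), ReportingDate date, AMT int)
INSERT @T1 VALUES ('A', 'X', '20151028', 100)
INSERT @T2 VALUES ('A', 'X', '20151028', 50)
-- Group first, then append: two rows, TOT = 100 and TOT = 50
SELECT TXN, CONT, ReportingDate, SUM(AMT) AS TOT FROM @T1 GROUP BY TXN, CONT, ReportingDate
UNION ALL
SELECT TXN, CONT, ReportingDate, SUM(AMT) AS TOT FROM @T2 GROUP BY TXN, CONT, ReportingDate
-- Append first, then group: one row, TOT = 150
SELECT TXN, CONT, ReportingDate, SUM(AMT) AS TOT
FROM (SELECT TXN, CONT, ReportingDate, AMT FROM @T1
      UNION ALL
      SELECT TXN, CONT, ReportingDate, AMT FROM @T2) t
GROUP BY TXN, CONT, ReportingDate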

Understanding why SQL query is taking so long

I have a fairly large SQL query written. Below is a simplification of the issue I am seeing.
SELECT *
FROM dbo.MyTransactionDetails TDTL
JOIN dbo.MyTransactions TRANS
on TDTL.ID = TRANS.ID
JOIN dbo.Customer CUST
on TRANS.CustID = CUST.CustID
WHERE TDTL.DetailPostTime > CONVERT(datetime, '2015-05-04 10:25:53', 120)
AND TDTL.DetailPostTime < CONVERT(datetime, '2015-05-04 19:25:53', 120)
The MyTransactionDetails table contains about 7 million rows and MyTransactions has about 300k rows.
The above query takes about 10 minutes to run, which is insane. All indexes have been reindexed and there is an index on all the ID columns.
Now if I add the lines below to the WHERE clause, the query takes about 1 second.
AND TRANS.TransBeginTime > CONVERT(datetime, '2015-05-05 10:25:53', 120)
AND TRANS.TransBeginTime < CONVERT(datetime, '2015-05-04 19:25:53', 120)
I know the contents of the database, and TransBeginTime is almost identical to DetailPostTime, so these extra WHERE clauses shouldn't filter out much more than the JOIN already does.
Why is the addition of these so much faster?
The problem is that I cannot use the filter on TransBeginTime, as it is not guaranteed that the transaction detail will be posted on the same date.
EDIT: I should also add that the execution plan says that 50% of the time is taken up by MyTransactionDetails
The percentages shown in the plan (both estimated and actual) are estimates based on the assumption that the estimated row counts are correct. In bad cases the percentages can be totally wrong, so much so that 1% can actually be 95%.
To figure out what is actually happening, turn on "statistics io". That will tell you the logical I/O count per table, and getting that down usually means the elapsed time goes down as well.
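A minimal sketch of collecting that output for the query in the question (SET STATISTICS IO and SET STATISTICS TIME are standard session settings):
SET STATISTICS IO ON
SET STATISTICS TIME ON
-- the problem query from the question
SELECT *
FROM dbo.MyTransactionDetails TDTL
JOIN dbo.MyTransactions TRANS
on TDTL.ID = TRANS.ID
JOIN dbo.Customer CUST
on TRANS.CustID = CUST.CustID
WHERE TDTL.DetailPostTime > CONVERT(datetime, '2015-05-04 10:25:53', 120)
AND TDTL.DetailPostTime < CONVERT(datetime, '2015-05-04 19:25:53', 120)
SET STATISTICS IO OFF
SET STATISTICS TIME OFF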
You can also look at the actual plan, and there's a lot of things that can cause slowness, like scans, sorts, key lookups, spools etc. If you include both statistics I/O and execution plan (preferably the actual xml, not just the picture) it is a lot easier to figure out what's going wrong.