I'm using an SQL Server 2012 and SET STATISTICS TIME ON to measure the CPU-time for my sql statements. I use this because i only want to get the time the database needs to execute the statement.
When returning large data from a select, i noticed the CPU-time going up pretty high, like using TOP 2000 will need about 400ms, but without it will need about 10000ms CPU-time.
What i'm not sure about is:
Is it possible that the CPU-time i get returned includes something like the time it needs to display the millions of rows returned in my Sql Server Management Studio? That would be pretty much of a bad situation.
Update:
The time i want to recieve is the execution time of the sql server without the time needed for the ssms to display the rows. There are several time statistics display in the Client statistics , but after searching for a long time it's really hard to find good references explaining what they are. Any suggestions?
Idea: elapsed time(sql server execution time) - client processing time (Client statistics)
Maybe this is an option?
In a multi-threaded world, CPU time is increasingly less helpful for simple tuning. Execution time is worth looking at.
To see if execution time (elapsed time) spent on displaying results is included you could SELECT TOP 2000 * INTO #temp to compare execution times.
Update:
My quick tests suggest the overhead of creating/inserting into a #temp table outweighs that of displaying results (at 5000). When I go to 50,000 results the SELECT INTO runs more quickly. The counts at which the two become equivalent depends on how many and what type of fields are returned. I tested with:
SET STATISTICS TIME ON
SELECT TOP 50000 NEWID()
FROM master..spt_values v1, master..spt_values v2
WHERE v1.number > 100
SET STATISTICS TIME OFF
-- CPU time = 32 ms, elapsed time = 121 ms.
SET STATISTICS TIME ON
SELECT TOP 50000 NEWID() col1
INTO #test
FROM master..spt_values v1, master..spt_values v2
WHERE v1.number > 100
SET STATISTICS TIME OFF
-- CPU time = 15 ms, elapsed time = 87 ms.
CPU time in SET STATISTICS TIME ON only counts the time that SQL Server needs to execute the query. It doesn't include any time the client takes to render the results. It also excludes any time SQL Server spends waiting for buffers to clear. In short, it really is pretty independent of the client.
Related
I have a query in SQL Server 2019 that does a SELECT on the primary key fields of a table. This table has about 6 million rows of data in it. I want to know exactly how fast my query is down to the microsecond (or at least the 100 microsecond). My query is faster than a millisecond, but all I can find in SQL server is query measurements accurate to the millisecond.
What I've tried so far:
SET STATISTICS TIME ON
This only shows milliseconds
Wrapping my query like so:
SELECT #Start=SYSDATETIME()
SELECT TOP 1 b.COL_NAME FROM BLAH b WHERE b.key = 0x1234
SELECT #End=SYSDATETIME();
SELECT DATEDIFF(MICROSECOND,#Start,#End)
This shows that no time has elapsed at all. But this isn't accurate because if I add WAITFOR DELAY '00:00:00.001', which should add a measurable millisecond of delay, it still shows 0 for the datediff. Only if I wat for 2 milliseconds do I see it show up in the datediff
Looking up the execution plan and getting the total_worker_time from the sys.dm_exec_query_stats table.
Here I see 600 microseconds, however the microsoft docs seem to indicate that this number cannot be trusted:
total_worker_time ... Total amount of CPU time, reported in microseconds (but only accurate to milliseconds)
I've run out of ideas and could use some help. Does anyone know how I can accurately measure my query in microseconds? Would extended events help here? Is there another performance monitoring tool I could use? Thank you.
This is too long for a comment.
In general, you don't look for performance measurements measured in microseconds. There is just too much variation, based on what else is happening in the database, on the server, and in the network.
Instead, you set up a loop and run the query thousands -- or even millions -- of times and then average over the executions. There are further nuances, such as clearing caches if you want to be sure that the query is using cold caches.
I created a new Google cloud project and set up BigQuery data base. I tried different queries, they all are executing too long. Currently we don't have a lot of data, so high performance was expected.
Below are some examples of queries and their execution time.
Query #1 (Job Id bquxjob_11022e81_172cd2d59ba):
select date(installtime) regtime
,count(distinct userclientid) users
,sum(fm.advcost) advspent
from DWH.DimUser du
join DWH.FactMarketingSpent fm on fm.date = date(du.installtime)
group by 1
The query failed in 1 hour + with error "Query exceeded resource limits. 14521.457814668494 CPU seconds were used, and this query must use less than 12800.0 CPU seconds."
Query execution plan: https://prnt.sc/t30bkz
Query #2 (Job Id bquxjob_41f963ae_172cd41083f):
select fd.date
,sum(fd.revenue) adrevenue
,sum(fm.advcost) advspent
from DWH.FactAdRevenue fd
join DWH.FactMarketingSpent fm on fm.date = fd.date
group by 1
Execution time ook 59.3 sec, 7.7 MB processed. What is too slow.
Query Execution plan: https://prnt.sc/t309t4
Query #3 (Job Id bquxjob_3b19482d_172cd31f629)
select date(installtime) regtime
,count(distinct userclientid) users
from DWH.DimUser du
group by 1
Execution time 5.0 sec elapsed, 42.3 MB processed. What is not terrible but must be faster for such small volumes of data.
Tables used :
DimUser - Table size 870.71 MB, Number of rows 2,771,379
FactAdRevenue - Table size 6.98 MB, Number of rows 53,816
FaceMarketingSpent - Table size 68.57 MB, Number of rows 453,600
The question is what am I doing wrong so that query execution time is so big? If everything is ok, I would be glad to hear any advice on how to reduce execution time for such simple queries. If anyone from google reads my question, I would appreciate if jobids are checked.
Thank you!
P.s. Previously I had experience using BigQuery for other projects and the performance and execution time were incredibly good for tables of 50+ TB size.
Posting same reply i've given in the gcp slack workspace:
Both your first two queries looks like you have one particular worker who is overloaded. Can see this because in the compute section, the max time is very different from the avg time. This could be for a number of reasons, but i can see that you are joining a table of 700k+ rows (looking at the 2nd input) to a table of ~50k (looking at the first input). This is not good practice, you should switch it so the larger table is the left most table. see https://cloud.google.com/bigquery/docs/best-practices-performance-compute?hl=en_US#optimize_your_join_patterns
You may also have a heavily skew in your join keys (e.g. 90% of rows are on 1/1/2020, or NULL). check this.
For the third query, that time is expected, try a approx count instead to speed it up. Also note BQ starts to get better if you perform the same query over and over, so this will get quicker.
I've got a SQL query that normally runs for about .5 seconds and now, as our server gets slightly more busy, it is timing out. It involves a join on a table that is getting updated every couple seconds as a matter of course.
That is, the EmailDetails table gets updated from a mail service to notify me of reads,clicks, etc and that table gets updated every couple seconds as those notifications come through. I'm wondering if the join below is timing out because those updates are locking the table and blocking my SQL join.
If so, any suggestions how to avoid my timeout would be appreciated.
SELECT TOP 25
dbo.EmailDetails.Id,
dbo.EmailDetails.AttendeesId,
dbo.EmailDetails.Subject,
dbo.EmailDetails.BodyText,
dbo.EmailDetails.EmailFrom,
dbo.EmailDetails.EmailTo,
dbo.EmailDetails.EmailDetailsGuid,
dbo.Attendees.UserFirstName,
dbo.Attendees.UserLastName,
dbo.Attendees.OptInTechJobKeyWords,
dbo.Attendees.UserZipCode,
dbo.EmailDetails.EmailDetailsTopicId,
dbo.EmailDetails.EmailSendStatus,
dbo.EmailDetails.TextTo
FROM
dbo.EmailDetails
LEFT OUTER JOIN
dbo.Attendees ON (dbo.EmailDetails.AttendeesId = dbo.Attendees.Id)
WHERE
(dbo.EmailDetails.EmailSendStatus = 'NEEDTOSEND' OR
dbo.EmailDetails.EmailSendStatus = 'NEEDTOTEXT')
AND
dbo.EmailDetails.EmailDetailsTopicId IS NOT NULL
ORDER BY
dbo.EmailDetails.EmailSendPriority,
dbo.EmailDetails.Id DESC
Adding Execution Plan:
https://dl.dropbox.com/s/bo6atz8bqv68t0i/emaildetails.sqlplan?dl=0
It takes .5 seconds on my fast macbook but on my real server with magnetic media it takes 8 seconds.
I am trying to diagnose slow application performance on a client site.
A log file on the client machine tells me execution time for each query measured From the application side. It appears that many bare-bones simple queries to the remote DB are taking an exorbitant amount of time to complete. For example,
SELECT CONVERT(varchar, GETDATE(), 121)
This query is repeatedly taking over 5 seconds to execute as timed from the application. Other queries almost as simple (inserting one recordset into one table) are taking over a minute to complete. On my test system (with a copy of the client's database) I do not experience any of these problems.
I would suspect a slow network, except that the problem reliably disappears after running a report from Crystal Reports. Then after 1-2 hours the application slows down again.
For the sake of isolating the problem further, I would like to retrieve/log the execution time on the server side. I am trying to figure out what the best way of doing this is. I could use a variable to obtain the execution time for a single query, but I don't have the option of modifying every single query in my application.
sys.dm_exec_query_stats looked very promising for retrieving execution times for previous queries, but the millisecond values it reports for last_elapsed_time seem far too high.
Can anyone help me figure out how to obtain timing for my queries?
Here it is A way
set statistics time on
SELECT CONVERT(varchar, GETDATE(), 121)
set statistics time off
And it will report the time spend for the query as
SQL Server parse and compile time:
CPU time = 0 ms, elapsed time = 0 ms.
I am trying to speed up a long running query that I have (takes about 10 minutes to run...). In order to track down what part of the query is costing me the most time I included the Actual Execution Plan when I ran it and found a particular section that was taking up 55% (screen shot below)
alt text http://img109.imageshack.us/img109/9571/53218794.png
This didn't quite seem right to me so I added Print '1' and Print '2' before and after this trouble section. When I run the query for a mere 17 seconds and then cancel it the 1 and 2 print out which I'm assuming means it's getting through that section in the first 17 seconds.
alt text http://img297.imageshack.us/img297/4739/66797633.png
Am I doing something wrong here or is my Execution plan misleading me?
Metrics from perfmon will also help figure out what's going wrong... you could be running into some serious IO issues with the drive your tempDB is residing on. Additionally, run a trace and look at the CPU & IO of the actual run.
Good perfmon metrics to look at are disk queue length (avg & writes).
If you don't have access to perfmon or don't want to trace things, use "SET STATISTICS IO ON" at the beginning of your query and allow it to complete...don't stop it. Just because an execution plan says it's taking over have the cost doesn't mean it will run for half of the query time...it could be much more (or less).
It says Query 10: Query cost (relative to the batch): 55%. Are 100% positive that it is the 10th statement in the batch that you surounded with Print statements? Could the INSERT ... INTO #mpProgramSet2 execute multiple times, some times in under 17 seconds other time for 5 minutes, depending on how much data was selected/inserted?
As a side note you should run with SET STATISTICS TIME ON rather that prints, this will give you exact compile/time and execution time of each statement in the batch.
I wouldn't trust that printing the '1' and '2' will prove anything about what has executed and what has not. I do the same thing, but I just wouldn't rely on it as proof. You could print the ##rowcount from that first insert query - that would indicate for sure that the insert has occurred.
Although the plan says that query may take 55% of the cost, it may not be 55% of the execution time, especially if the query results are cached.
Another advantage of printing the ##rowcount is to compare the actual number of rows to the estimated rows (51K). If they differ by a lot then you might investigate the statistics for your indexes.
We would need the full query to understand what's going on; but I would probably start with setting MAXDOP to 1 in order to limit the number of processors it's running on.
Note that sometimes queries need to be limited to only 1 processor due to locks etc.
Further you might try adding NOLOCKs to any of your selects which can get away with dirty reads.