Get execution time of PostgreSQL query - sql

DECLARE #StartTime datetime,#EndTime datetime
SELECT #StartTime=GETDATE()
select distinct born_on.name
from born_on,died_on
where (FLOOR(('2012-01-30'-born_on.DOB)/365.25) <= (
select max(FLOOR((died_on.DOD - born_on.DOB)/365.25))
from died_on, born_on
where (died_on.name=born_on.name))
)
and (born_on.name <> All(select name from died_on))
SELECT #EndTime=GETDATE()
SELECT DATEDIFF(ms,#StartTime,#EndTime) AS [Duration in millisecs]
I am unable to get the query time. Instead I get the following error:
sql:/home/an/Desktop/dbms/query.sql:9: ERROR: syntax error at or near "#"
LINE 1: DECLARE #StartTime datetime,#EndTime datetime

If you mean in psql, rather than some program you are writing, use \? for the help, and see:
\timing [on|off] toggle timing of commands (currently off)
And then you get output like:
# \timing on
Timing is on.
# select 1234;
?column?
----------
1234
(1 row)
Time: 0.203 ms

There are various ways to measure execution time, each has pros and cons. But whatever you do, some degree of the observer effect applies. I.e., measuring itself may distort the result.
1. EXPLAIN ANALYZE
You can prepend EXPLAIN ANALYZE, which reports the whole query plan with estimated costs actually measured times. The query is actually executed (with all side -effect, if any!). Works for all DDL commands and some others. See:
EXPLAIN ANALYZE not working with ALTER TABLE
To check whether my adapted version of your query is, in fact, faster:
EXPLAIN ANALYZE
SELECT DISTINCT born_on.name
FROM born_on b
WHERE date '2012-01-30' - b.dob <= (
SELECT max(d1.dod - b1.dob)
FROM born_on b1
JOIN died_on d1 USING (name) -- name must be unique!
)
AND NOT EXISTS (
SELECT FROM died_on d2
WHERE d2.name = b.name
);
Execute a couple of times to get more comparable times with warm cache. Several options are available to adjust the level of detail.
While mainly interested in total execution time, make it:
EXPLAIN (ANALYZE, COSTS OFF, TIMING OFF)
Mostly, TIMING matters - the manual:
TIMING
Include actual startup time and time spent in each node in the output.
The overhead of repeatedly reading the system clock can slow down the
query significantly on some systems, so it may be useful to set this
parameter to FALSE when only actual row counts, and not exact times,
are needed. Run time of the entire statement is always measured, even
when node-level timing is turned off with this option. [...]
EXPLAIN ANALYZE measures on the server, using server time from the server OS, excluding network latency. But EXPLAIN adds some overhead to also output the query plan.
2. psql with \timing
Or use \timing in psql. Like Peter demonstrates.
The manual:
\timing [ on | off ]
With a parameter, turns displaying of how long each SQL statement
takes on or off. Without a parameter, toggles the display between on
and off. The display is in milliseconds; intervals longer than 1
second are also shown in minutes:seconds format, with hours and days
fields added if needed.
Important difference: psql measures on the client using local time from the local OS, so the time includes network latency. This can be a negligible difference or huge depending on connection and volume of returned data.
3. Enable log_duration
This has probably the least overhead per measurement and produces the least distorted timings. But it's a little heavy-handed as you have to be superuser, have to adjust the server configuration, cannot just target the execution of a single query, and you have to read the server logs (unless you redirect to stdout).
The manual:
log_duration (boolean)
Causes the duration of every completed statement to be logged. The
default is off. Only superusers can change this setting.
For clients using extended query protocol, durations of the Parse,
Bind, and Execute steps are logged independently.
There are related settings like log_min_duration_statement.
4. Precise manual measurement with clock_timestamp()
The manual:
clock_timestamp() returns the actual current time, and therefore its value changes even within a single SQL command.
filiprem provided a great way to get execution times for ad-hoc queries as exact as possible. On modern hardware, timing overhead should be insignificant but depending on the host OS it can vary wildly. Find out with the server application pg_test_timing.
Else you can mostly filter the overhead like this:
DO
$do$
DECLARE
_timing1 timestamptz;
_start_ts timestamptz;
_end_ts timestamptz;
_overhead numeric; -- in ms
_timing numeric; -- in ms
BEGIN
_timing1 := clock_timestamp();
_start_ts := clock_timestamp();
_end_ts := clock_timestamp();
-- take minimum duration as conservative estimate
_overhead := 1000 * extract(epoch FROM LEAST(_start_ts - _timing1
, _end_ts - _start_ts));
_start_ts := clock_timestamp();
PERFORM 1; -- your query here, replacing the outer SELECT with PERFORM
_end_ts := clock_timestamp();
-- RAISE NOTICE 'Timing overhead in ms = %', _overhead;
RAISE NOTICE 'Execution time in ms = %' , 1000 * (extract(epoch FROM _end_ts - _start_ts)) - _overhead;
END
$do$;
Take the time repeatedly (doing the bare minimum with 3 timestamps here) and pick the minimum interval as conservative estimate for timing overhead. Also, executing the function clock_timestamp() a couple of times should warm it up (in case that matters for your OS).
After measuring the execution time of the payload query, subtract that estimated overhead to get closer to the actual time.
Of course, it's more meaningful for cheap queries to loop 100000 times or execute it on a table with 100000 rows if you can, to make distracting noise insignificant.

PostgreSQL is not Transact-SQL. These are two slightly different things.
In PostgreSQL, this would be something along the lines of
DO $proc$
DECLARE
StartTime timestamptz;
EndTime timestamptz;
Delta double precision;
BEGIN
StartTime := clock_timestamp();
PERFORM foo FROM bar; /* Put your query here, replacing SELECT with PERFORM */
EndTime := clock_timestamp();
Delta := 1000 * ( extract(epoch from EndTime) - extract(epoch from StartTime) );
RAISE NOTICE 'Duration in millisecs=%', Delta;
END;
$proc$;
On the other hand, measuring query time does not have to be this complicated. There are better ways:
In postgres command line client there is a \timing feature which measures query time on client side (similar to duration in bottomright corner of SQL Server Management Studio).
It's possible to record query duration in server log (for every query, or only when it lasted longer than X milliseconds).
It's possible to collect server-side timing for any single statement using the EXPLAIN command:
EXPLAIN (ANALYZE, BUFFERS) YOUR QUERY HERE;

Related

Check the execution time of a query accurate to the microsecond

I have a query in SQL Server 2019 that does a SELECT on the primary key fields of a table. This table has about 6 million rows of data in it. I want to know exactly how fast my query is down to the microsecond (or at least the 100 microsecond). My query is faster than a millisecond, but all I can find in SQL server is query measurements accurate to the millisecond.
What I've tried so far:
SET STATISTICS TIME ON
This only shows milliseconds
Wrapping my query like so:
SELECT #Start=SYSDATETIME()
SELECT TOP 1 b.COL_NAME FROM BLAH b WHERE b.key = 0x1234
SELECT #End=SYSDATETIME();
SELECT DATEDIFF(MICROSECOND,#Start,#End)
This shows that no time has elapsed at all. But this isn't accurate because if I add WAITFOR DELAY '00:00:00.001', which should add a measurable millisecond of delay, it still shows 0 for the datediff. Only if I wat for 2 milliseconds do I see it show up in the datediff
Looking up the execution plan and getting the total_worker_time from the sys.dm_exec_query_stats table.
Here I see 600 microseconds, however the microsoft docs seem to indicate that this number cannot be trusted:
total_worker_time ... Total amount of CPU time, reported in microseconds (but only accurate to milliseconds)
I've run out of ideas and could use some help. Does anyone know how I can accurately measure my query in microseconds? Would extended events help here? Is there another performance monitoring tool I could use? Thank you.
This is too long for a comment.
In general, you don't look for performance measurements measured in microseconds. There is just too much variation, based on what else is happening in the database, on the server, and in the network.
Instead, you set up a loop and run the query thousands -- or even millions -- of times and then average over the executions. There are further nuances, such as clearing caches if you want to be sure that the query is using cold caches.

Finding Execution time of query using SQL Developer

I am beginner with Oracle DB.
I want to know execution time for a query. This query returns around 20,000 records.
When I see the SQL Developer, it shows only 50 rows, maximum tweakable to 500. And using F5 upto 5000.
I would have done by making changes in the application, but application redeployment is not possible as it is running on production.
So, I am limited to using only SQL Developer. I am not sure how to get the seconds spent for execution of the query ?
Any ideas on this will help me.
Thank you.
Regards,
JE
If you scroll down past the 50 rows initially returned, it fetches more. When I want all of them, I just click on the first of the 50, then press CtrlEnd to scroll all the way to the bottom.
This will update the display of the time that was used (just above the results it will say something like "All Rows Fetched: 20000 in 3.606 seconds") giving you an accurate time for the complete query.
If your statement is part of an already deployed application and if you have rights to access the view V$SQLAREA, you could check for number of EXECUTIONS and CPU_TIME. You can search for the statement using SQL_TEXT:
SELECT CPU_TIME, EXECUTIONS
FROM V$SQLAREA
WHERE UPPER (SQL_TEXT) LIKE 'SELECT ... FROM ... %';
This is the most precise way to determine the actual run time. The view V$SESSION_LONGOPS might also be interesting for you.
If you don't have access to those views you could also use a cursor loop for running through all records, e.g.
CREATE OR REPLACE PROCEDURE speedtest AS
count number;
cursor c_cursor is
SELECT ...;
BEGIN
-- fetch start time stamp here
count := 0;
FOR rec in c_cursor
LOOP
count := count +1;
END LOOP;
-- fetch end time stamp here
END;
Depending on the architecture this might be more or less accurate, because data might need to be transmitted to the system where your SQL is running on.
You can change those limits; but you'll be using some time in the data transfer between the DB and the client, and possibly for the display; and that in turn would be affected by the number of rows pulled by each fetch. Those things affect your application as well though, so looking at the raw execution time might not tell you the whole story anyway.
To change the worksheet (F5) limit, go to Tools->Preferences->Database->Worksheet, and increase the 'Max rows to print in a script' value (and maybe 'Max lines in Script output'). To change the fetch size go to the Database->Advanced panel in the preferences; maybe to match your application's value.
This isn't perfect but if you don't want to see the actual data, just get the time it takes to run in the DB, you can wrap the query to get a single row:
select count(*) from (
<your original query
);
It will normally execute the entire original query and then count the results, which won't add anything significant to the time. (It's feasible it might rewrite the query internally I suppose, but I think that's unlikely, and you could use hints to avoid it if needed).

Understanding the Sql Server execution time

I'm using an SQL Server 2012 and SET STATISTICS TIME ON to measure the CPU-time for my sql statements. I use this because i only want to get the time the database needs to execute the statement.
When returning large data from a select, i noticed the CPU-time going up pretty high, like using TOP 2000 will need about 400ms, but without it will need about 10000ms CPU-time.
What i'm not sure about is:
Is it possible that the CPU-time i get returned includes something like the time it needs to display the millions of rows returned in my Sql Server Management Studio? That would be pretty much of a bad situation.
Update:
The time i want to recieve is the execution time of the sql server without the time needed for the ssms to display the rows. There are several time statistics display in the Client statistics , but after searching for a long time it's really hard to find good references explaining what they are. Any suggestions?
Idea: elapsed time(sql server execution time) - client processing time (Client statistics)
Maybe this is an option?
In a multi-threaded world, CPU time is increasingly less helpful for simple tuning. Execution time is worth looking at.
To see if execution time (elapsed time) spent on displaying results is included you could SELECT TOP 2000 * INTO #temp to compare execution times.
Update:
My quick tests suggest the overhead of creating/inserting into a #temp table outweighs that of displaying results (at 5000). When I go to 50,000 results the SELECT INTO runs more quickly. The counts at which the two become equivalent depends on how many and what type of fields are returned. I tested with:
SET STATISTICS TIME ON
SELECT TOP 50000 NEWID()
FROM master..spt_values v1, master..spt_values v2
WHERE v1.number > 100
SET STATISTICS TIME OFF
-- CPU time = 32 ms, elapsed time = 121 ms.
SET STATISTICS TIME ON
SELECT TOP 50000 NEWID() col1
INTO #test
FROM master..spt_values v1, master..spt_values v2
WHERE v1.number > 100
SET STATISTICS TIME OFF
-- CPU time = 15 ms, elapsed time = 87 ms.
CPU time in SET STATISTICS TIME ON only counts the time that SQL Server needs to execute the query. It doesn't include any time the client takes to render the results. It also excludes any time SQL Server spends waiting for buffers to clear. In short, it really is pretty independent of the client.

mysterious oracle query

if a query in oracle takes the first time it is executed 11 minutes, and the next time, the same query 25 seconds, with the buffer being flushed, what is the possible cause? could it be that the query is written in a bad way?
set timing on;
set echo on
set lines 999;
insert into elegrouptmp select idcll,idgrpl,0 from elegroup where idgrpl = 109999990;
insert into SLIMONTMP (idpartes, indi, grecptseqs, devs, idcll, idclrelpayl)
select rel.idpartes, rel.indi, rel.idgres,rel.iddevs,vpers.idcll,nvl(cdsptc.idcll,vpers.idcll)
from
relbqe rel,
elegrouptmp ele,
vrdlpers vpers
left join cdsptc cdsptc on
(cdsptc.idclptcl = vpers.idcll and
cdsptc.cdptcs = 'NOS')
where
rel.idtits = '10BCPGE ' and
vpers.idbqes = rel.idpartes and
vpers.cdqltptfc = 'N' and
vpers.idcll = ele.idelegrpl and
ele.idgrpl = 109999990;
alter system flush shared_pool;
alter system flush buffer_cache;
alter system flush global context;
select /* original */ mvtcta_part_SLIMONtmp.idpartes,mvtcta_part_SLIMONtmp.indi,mvtcta_part_SLIMONtmp.grecptseqs,mvtcta_part_SLIMONtmp.devs,
mvtcta_part_SLIMONtmp.idcll,mvtcta_part_SLIMONtmp.idclrelpayl,mvtcta_part_vrdlpers1.idcll,mvtcta_part_vrdlpers1.shnas,mvtcta_part_vrdlpers1.cdqltptfc,
mvtcta_part_vrdlpers1.idbqes,mvtcta_part_compte1.idcll,mvtcta_part_compte1.grecpts,mvtcta_part_compte1.seqc,mvtcta_part_compte1.devs,mvtcta_part_compte1.sldminud,
mvtcta.idcll,mvtcta.grecptseqs,mvtcta.devs,mvtcta.termel,mvtcta.dtcptl,mvtcta.nusesi,mvtcta.fiches,mvtcta.indl,mvtcta.nuecrs,mvtcta.dtexel,mvtcta.dtvall,
mvtcta.dtpayl,mvtcta.ioi,mvtcta.mtd,mvtcta.cdlibs,mvtcta.libcps,mvtcta.sldinitd,mvtcta.flagtypei,mvtcta.flagetati,mvtcta.flagwarnl,mvtcta.flagdonei,mvtcta.oriindl,
mvtcta.idportfl,mvtcta.extnuecrs
from SLIMONtmp mvtcta_part_SLIMONtmp
left join vrdlpers mvtcta_part_vrdlpers1 on
(
mvtcta_part_vrdlpers1.idbqes = mvtcta_part_SLIMONtmp.idpartes
and mvtcta_part_vrdlpers1.cdqltptfc = 'N'
and mvtcta_part_vrdlpers1.idcll = mvtcta_part_SLIMONtmp.idcll
)
left join compte mvtcta_part_compte1 on
(
mvtcta_part_compte1.idcll = mvtcta_part_vrdlpers1.idcll
and mvtcta_part_compte1.grecpts = substr (mvtcta_part_SLIMONtmp.grecptseqs, 1, 2 )
and mvtcta_part_compte1.seqc = substr (mvtcta_part_SLIMONtmp.grecptseqs, -1 )
and mvtcta_part_compte1.devs = mvtcta_part_SLIMONtmp.devs
and (mvtcta_part_compte1.devs = ' ' or ' ' = ' ')
and mvtcta_part_compte1.cdpartc not in ( 'L' , 'R' )
)
left join mvtcta mvtcta on
(
mvtcta.idcll = mvtcta_part_SLIMONtmp.idclrelpayl
and mvtcta.devs = mvtcta_part_SLIMONtmp.devs
and mvtcta.grecptseqs = mvtcta_part_SLIMONtmp.grecptseqs
and mvtcta.flagdonei <> 0
and mvtcta.devs = mvtcta_part_compte1.devs
and mvtcta.dtvall > 20101206
)
where 1=1
order by mvtcta_part_compte1.devs,
mvtcta_part_SLIMONtmp.idpartes,
mvtcta_part_SLIMONtmp.idclrelpayl,
mvtcta_part_SLIMONtmp.grecptseqs,
mvtcta.dtvall;
"if a query in oracle takes the first
time it is executed 11 minutes, and
the next time, the same query 25
seconds, with the buffer being
flushed, what is the possible cause?"
The thing is, flushing the DB Buffers, like this ...
alter system flush shared_pool
/
... wipes the Oracle data store but there are other places where data gets cached. For instance the chances are your OS caches its file reads.
EXPLAIN PLAN is good as a general guide to how the database thinks it will execute a query, but it is only a prediction. It can be thrown out by poor statistics or ambient conditions. It is not good at explaining why a specific instance of a query took as much time as it did.
So, if you really want to understand what occurs when the database executes a specific query you need to get down and dirty, and learn how to use the Wait Interface. This is a very powerful tracing mechanism, which allows us to see the individual events that happen over the course of a single query execution. Each version of Oracle has extended the utility and richness of the Wait Interface, but it has been essential to proper tuning since Oracle 9i (if not earlier).
Find out more by reading Roger Schrag's very good overview .
In your case you'll want to run the trace multiple times. In order to make it easier to compare results you should use a separate session for each execution, setting the 10046 event each time.
What else is happening on the box when you ran these? You can get different timings based on other processes chewing resources. Also, with a lot of joins, performance will depend on memory usage (hash_area_size, sort_area_size, etc) and availability, so perhaps you are paging (check temp space size/usage also). In short, try sql_trace and tkprof to analyze deeper
Sometimes a block is written to the file system before it is committed (a dirty block). When that block is read later, Oracle sees that it was uncommitted. It checks the open transaction and, if the transaction isn't still there, it knows the change was committed. Therefore it writes the block back as a clean block. It is called delayed block cleanout.
That is one possible reason why reading blocks for the first time can be slower than a subsequent re-read.
Could be the second time the execution plan is known. Maybe the optimizer has a very hard time finding a execution plan for some reason.
Try setting
alter session set optimizer_max_permutations=100;
and rerun the query. See if that makes any difference.
could it be that the query is written in a bad way?
"bad" is a rather emotional expression - but broadly speaking, yes, if a query performs significantly faster the second time it's run, it usually means there are ways to optimize the query.
Actually optimizing the query is - as APC says - rather a question of "down and dirty". Obvious candidate in your examply might be the substring - if the table is huge, and the substring misses the index, I'd imagine that would take a bit of time, and I'd imagine the result of all those substrin operations are cached somewhere.
Here's Tom Kyte's take on flushing Oracle buffers as a testing practice. Suffice it to say he's not a fan. He favors the approach of attempting to emulate your production load with your test data ("real life"), and tossing out the first and last runs. #APC's point about OS caching is Tom's point - to get rid of that (non-trivial!) effect you'd need to bounce the server, not just the database.

How does mySQL handle dynamic value within ORDER BY

It stumbled upon me while I was reading the query in another post.
Take the following query for example (ignore the non-practical use of the ordering):
SELECT
*
FROM Members
ORDER BY (TIMESTAMPDIFF(FRAC_SECOND, DateCreated , SYSDATE()))
Say "Members" table has a huge row count (or the query is complex enough for it to be executed over at least dozen of milliseconds). How does mySQL or other mainstream DB engines evaluate the "SYSDATE()" in the "ORDER BY"?
Say the query takes half a second, the microsecond (FRAC_SECOND) of "SYSDATE" changes 1000 X 1000 X 0.5 = 500 000 times.
My questions are:
Does the "SYSDATE" get fixed on the
start of the query execution or it
gets evaluated and changes as the
execution progresses?
If it's the latter, can I assume the ordering might be jumbled?
UPDATE:
My original post uses NOW as an example of dynamic value, it's SYSDATE now
NOW() returns a constant time that
indicates the time at which the
statement began to execute. (Within a
stored function or trigger, NOW()
returns the time at which the function
or triggering statement began to
execute.) This differs from the
behavior for SYSDATE(), which returns
the exact time at which it executes as
of MySQL 5.0.12.
http://dev.mysql.com/doc/refman/5.0/en/date-and-time-functions.html#function_now
In other words, it is executed only once when the statement is executed.
However, if you want to obtain the time at each execution you should use SYSDATE
As of MySQL 5.0.12, SYSDATE() returns
the time at which it executes. This
differs from the behavior for NOW(),
which returns a constant time that
indicates the time at which the
statement began to execute. (Within a
stored function or trigger, NOW()
returns the time at which the function
or triggering statement began to
execute.)
http://dev.mysql.com/doc/refman/5.0/en/date-and-time-functions.html#function_sysdate
Update:
Well, from what I know Order by will be executed or better said "used" only once. Since the value of TIMESTAMPDIFF(FRAC_SECOND, DateCreated , SYSDATE()) will be different every time you execute the SELECT statement. So, I think (once again I think) ORDER BY will consider either the first evaluated value of the timestampdiff or the last one. Anyway, I think by executing this - you will get a random order every time. Maybe there are better experts than me here who can answer better.