Measure query execution time excluding start-up cost in postgres - sql

I want to measure the total time taken by postgres to execute my query excluding the start-up cost. Earlier I was using \timing but now I found \timing includes start-up cost.
I also tried: "explain analyze" in which I found that actual time is specified in a particular format like: actual time=12.04..12.09
So, does this mean that the time taken to execute postgres query excluding start-up time is 0.05. If not, then is there a way to exclude start-up costs and measure query execution time?

What you want is actually quite ill-defined.
"Startup cost" could mean:
network connection overhead and back-end start cost of establishing a new connection. Avoided by re-using the same session.
network round-trip times for sending the query and getting the results. Avoided by measuring the timing server-side with log_statement_min_duration = 0 or (with timing overhead) using explain analyze or the auto_explain module.
Query planning time. Avoided by PREPAREing the query, then timing only the subsequent EXECUTE.
Lock acquisition time. There is not currently any way to exclude this.
Note that using EXPLAIN ANALYZE may not be ideal for your purposes: it throws the query result away, and it adds its own costs because of the detailed timing it does. I would set log_statement_min_duration = 0, set client_min_messages appropriately, and capture the timings from the log output.
So it sounds like you want to PREPARE a query then EXPLAIN ANALYZE EXECUTE it or just EXECUTE it with log_statement_min_duration set to 0.

For exploring PLANNING costs and EXECUTE costs separately you need to set on several postgres.conf parameters:
log_planner_stats = on
log_executor_stats = on
and explore your log file.
Update:
1. find your config file location with executing:
SHOW config_file;
2. Set parameters. Don't foget to remove comment-symbol '#'.
3. Restart postgresql service
4. Execute your query
5. Explore your log file.

Related

Is there a way to check if a query hit a cached result or not in BigQuery?

We are performance tuning, both in terms of cost and speed, some of our queries and the results we get are a little bit weird. First, we had one query that did an overwrite on an existing table, we stopped that one after 4 hours. Running the same query to an entirely new table and it only took 5 minutes. I wonder if the 5 minute query maybe used a cached result from the first run, is that possible to check? Is it possible to force BigQuery to not use cache?
If you run query in UI - expand Options and make sure Use Cached Result properly set
Also, in UI, you can check Job Details to see if cached result was used
If you run your query programmatically - you should use respective attributes - configuration.query.useQueryCache and statistics.query.cacheHit

out of memory sql execution

I have the following script:
SELECT
DEPT.F03 AS F03, DEPT.F238 AS F238, SDP.F04 AS F04, SDP.F1022 AS F1022,
CAT.F17 AS F17, CAT.F1023 AS F1023, CAT.F1946 AS F1946
FROM
DEPT_TAB DEPT
LEFT OUTER JOIN
SDP_TAB SDP ON SDP.F03 = DEPT.F03,
CAT_TAB CAT
ORDER BY
DEPT.F03
The tables are huge, when I execute the script in SQL Server directly it takes around 4 min to execute, but when I run it in the third party program (SMS LOC based on Delphi) it gives me the error
<msg> out of memory</msg> <sql> the code </sql>
Is there anyway I can lighten the script to be executed? or did anyone had the same problem and solved it somehow?
I remember having had to resort to the ROBUST PLAN query hint once on a query where the query-optimizer kind of lost track and tried to work it out in a way that the hardware couldn't handle.
=> http://technet.microsoft.com/en-us/library/ms181714.aspx
But I'm not sure I understand why it would work for one 'technology' and not another.
Then again, the error message might not be from SQL but rather from the 3rd-party program that gathers the output and does so in a 'less than ideal' way.
Consider adding paging to the user edit screen and the underlying data call. The point being you dont need to see all the rows at one time, but they are available to the user upon request.
This will alleviate much of your performance problem.
I had a project where I had to add over 7 million individual lines of T-SQL code via batch (couldn't figure out how to programatically leverage the new SEQUENCE command). The problem was that there was limited amount of memory available on my VM (I was allocated the max amount of memory for this VM). Because of the large amount lines of T-SQL code I had to first test how many lines it could take before the server crashed. For whatever reason, SQL (2012) doesn't release the memory it uses for large batch jobs such as mine (we're talking around 12 GB of memory) so I had to reboot the server every million or so lines. This is what you may have to do if resources are limited for your project.

SQL_ID, CBO, Optimizer_Mode and Plan Change History

This is a 4 part question
What is the logic behind SQL_ID. . . Does the value change for the same SQL over time? Does it persists between DB Restarts? Or every plan change gives a new SQL_ID?
How can i check the plan change history for a particular query? Given the SQL_ID i tried querying dba_hist_sqlstat table but it does not give the time of plan change and other details so as to be able to match with the v$sql_plan table.
I have the parameter optimizer_mode set to FIRST_ROWS. Even then when is see the table dba_hist_sqlstat, it indicate ALL_ROWS for some SQL_IDs . . . Can oracle disregard the instance level parameter to use what it deems most suitable?
Between 8PM and 2 PM a query was performing badly. Taking 6 seconds for its execution. After 3 PM the query started responding in < 1 Second. I have the AWR report for the periods that shows this detail. There was no difference in load on the DB in these 2 periods. How could i arrive at the root of this? I am trying to find the History of the plan change but appreciate more feedback to best analyze such issues.
The DB Version is Oracle 10.1.0.4 running on AIX 5.3 64b
1.- SQLID is calculated with a hash function based on the actual SQL text, it shouldnt change with restart or between databases at least the same versions, different oracle versions could have different hashing functions right?, so as long as you do not change the sql text (this includes blanks commas and everything) SQLID will remain the same.
2.- Use DBMS_XPLAN.DISPLAY_AWR to display all plans for a SQL_ID: select * from table(dbms_xplan.display_awr(sql_id => '[your SQL_ID]'));
3.- Oracle will only do this if a query has an OPTIMIZER GOAL hint in it.
4.- There are many things in play for this one. I would start by looking at top 5 timed events in AWR for both periods of time. If they are alike, I would then go investigate the PLAN history for the statement, see if it changed during periods and how data behaved during the periods as well. One of these three should give you the answer.

Time taken to display output in postgres 9.1

Does postgres include the time taken for rendering the output on screen within \timing or explain analyze. From what I understand it does not. Am I correct?
Actually I am outputting a lot of rows on screen and I find that postgres does not take much time to display them on screen, whereas if I write a simple C program to output the results, then the C programs takes about 3000ms. Whereas postgres takes about 500ms to display the same data on screen.
"postgres" doesn't display anything at all. I think you mean the psql client.
If so: \timing displays time including the time to receive the data from the server. EXPLAIN ANALYZE doesn't, but adds the overhead of doing the detailed server-side timing. log_min_duration_statement just records statement timings server-side.

Is my execution plan trying to trick me?

I am trying to speed up a long running query that I have (takes about 10 minutes to run...). In order to track down what part of the query is costing me the most time I included the Actual Execution Plan when I ran it and found a particular section that was taking up 55% (screen shot below)
alt text http://img109.imageshack.us/img109/9571/53218794.png
This didn't quite seem right to me so I added Print '1' and Print '2' before and after this trouble section. When I run the query for a mere 17 seconds and then cancel it the 1 and 2 print out which I'm assuming means it's getting through that section in the first 17 seconds.
alt text http://img297.imageshack.us/img297/4739/66797633.png
Am I doing something wrong here or is my Execution plan misleading me?
Metrics from perfmon will also help figure out what's going wrong... you could be running into some serious IO issues with the drive your tempDB is residing on. Additionally, run a trace and look at the CPU & IO of the actual run.
Good perfmon metrics to look at are disk queue length (avg & writes).
If you don't have access to perfmon or don't want to trace things, use "SET STATISTICS IO ON" at the beginning of your query and allow it to complete...don't stop it. Just because an execution plan says it's taking over have the cost doesn't mean it will run for half of the query time...it could be much more (or less).
It says Query 10: Query cost (relative to the batch): 55%. Are 100% positive that it is the 10th statement in the batch that you surounded with Print statements? Could the INSERT ... INTO #mpProgramSet2 execute multiple times, some times in under 17 seconds other time for 5 minutes, depending on how much data was selected/inserted?
As a side note you should run with SET STATISTICS TIME ON rather that prints, this will give you exact compile/time and execution time of each statement in the batch.
I wouldn't trust that printing the '1' and '2' will prove anything about what has executed and what has not. I do the same thing, but I just wouldn't rely on it as proof. You could print the ##rowcount from that first insert query - that would indicate for sure that the insert has occurred.
Although the plan says that query may take 55% of the cost, it may not be 55% of the execution time, especially if the query results are cached.
Another advantage of printing the ##rowcount is to compare the actual number of rows to the estimated rows (51K). If they differ by a lot then you might investigate the statistics for your indexes.
We would need the full query to understand what's going on; but I would probably start with setting MAXDOP to 1 in order to limit the number of processors it's running on.
Note that sometimes queries need to be limited to only 1 processor due to locks etc.
Further you might try adding NOLOCKs to any of your selects which can get away with dirty reads.