Postgres EXPLAIN ANALYZE is much faster than running the query normally

I'm trying to optimise a PostgreSQL 8.4 query. After greatly simplifying the original query, trying to figure out what's making it choose a bad query plan, I got to the point where running the query under EXPLAIN ANALYZE takes only 0.5s, while running it normally takes 2.8s. It seems obvious then, that what EXPLAIN ANALYZE is showing me is not what it normally does, so whatever it's showing me is useless, isn't it? What is going on here and how do I see what it's really doing?

Most likely, the data pages are in the OS disk cache when you are manually running with EXPLAIN ANALYZE in order to try and optimize the query. When run in a normal environment, the pages probably aren't in the cache already and have to be fetched from disk, increasing the runtime.

It shows less time because:
1) The Total runtime shown by EXPLAIN ANALYZE includes executor start-up and shut-down time, as well as the time to run any triggers that are fired, but it does not include parsing, rewriting, or planning time.
2) Since no output rows are delivered to the client, network transmission costs and I/O conversion costs are not included.
Warning!
The measurement overhead added by EXPLAIN ANALYZE can be significant, especially on machines with slow gettimeofday() operating-system calls. So, it's advisable to use EXPLAIN (ANALYZE TRUE, TIMING FALSE).
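For illustration, a minimal sketch against a made-up table (note that the TIMING option was only added in PostgreSQL 9.2, so it is not available on the 8.4 release in the question):
EXPLAIN ANALYZE SELECT * FROM orders WHERE customer_id = 42;                        -- per-node timing, maximum timer overhead
EXPLAIN (ANALYZE TRUE, TIMING FALSE) SELECT * FROM orders WHERE customer_id = 42;   -- row counts and total time, much less timer overhead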

Related

How do I compare two SQL queries to run on Postgres

I need to compare two queries that will run in my Postgres database.
How do I know the execution time and any other statistics of them so I can produce a reliable benchmark between them?
I can think of two interesting data points to collect and compare:
The execution time.
For that, simply execute the query using psql connected via UNIX sockets (to factor out the network) and use psql's \timing command to measure the execution time as seen on the client.
Do not use EXPLAIN (ANALYZE) for that since that would add notable overhead which affects your measurements.
Make sure to run the query several times to get a reliable number. That number will correspond to the execution time with a warm cache.
If you want to measure execution time with a cold cache, restart PostgreSQL and empty the file system cache.
The number of blocks touched by the query.
For that, run EXPLAIN (ANALYZE, BUFFERS) once for each query.
The number of blocks touched is significant for performance: the fewer blocks a query touches, the faster it will (often) be. This number is particularly significant for performance with a cold cache; the fewer blocks, the less execution time will depend on caching.
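As a rough sketch of both measurements (the query and table names are placeholders):
\timing on
SELECT count(*) FROM orders WHERE status = 'shipped';                      -- run several times; psql prints Time: ... ms after each run
EXPLAIN (ANALYZE, BUFFERS) SELECT count(*) FROM orders WHERE status = 'shipped';   -- look at the shared hit/read block counts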

Testing for Performance in Multiple SQL Queries

I'm working to improve the efficiency of some SQL queries on SQL Server 2008. There are different ways of performing each query and I want to find the fastest of them.
However, the issue that I'm having is that I am having trouble determining which is actually executing faster. Ideally I could just run each query one after the other and see which runs fastest. Ideally...
The problem is that SQL Server is too smart for my liking. When constructing these queries I run them multiple times, and when I do, the queries' efficiency improves on its own. I would imagine this is because of some behind-the-scenes work that SQL Server does. What is this? How can I avoid it?
For example, I run the query once and it takes 30s. I run it again and it takes 10s. The more I run the query the faster it seems to run.
So.. Is there any way of "clearing the cache" or whatever the equivalent would be in SQL? I want to get an accurate indication of which query is going to actually run faster. Alternatively, what would be the best way to do the type of testing that I want?
Any information in regards to this topic would be accepted as valid input.
When the query is run the first time, the data is most likely still on disk and SQL Server has to fetch it. When you run the same query again, the data is already in RAM, so it is much faster than going back to disk.
Run DBCC DROPCLEANBUFFERS and DBCC FREEPROCCACHE to clear the caches without doing a restart.
You need to look at execution plans, STATISTICS IO and STATISTICS TIME to really see what is going on. In the plan, look for implicit conversions and also for scans (you want seeks if possible).
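A sketch of turning those measurements on around a query (the query itself is just a placeholder):
SET STATISTICS IO ON;      -- logical and physical reads per table
SET STATISTICS TIME ON;    -- parse/compile and execution CPU/elapsed times
SELECT o.OrderID, o.Total FROM dbo.Orders AS o WHERE o.CustomerID = 42;   -- the query under test
SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;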
See also Client Statistics in SSMS to check execution times.
The improvement in speed that you see is a result of the database's caches. SQL Server keeps recently read data pages in its buffer cache and reuses compiled execution plans from the plan cache, so repeated runs of the same query get faster until the cached pages and plans are evicted or invalidated.
This post gives good pointers on how to work around this for performance tuning. You should also look into Execution Plans, which show you how the database would run the query, without actually running it. The benefit of this is that you can see if full table scans are being done where an index could be used instead.
Include the Actual Execution Plan and execute the following command:
CHECKPOINT;               -- write dirty pages to disk so clean buffers can be dropped
GO
DBCC DROPCLEANBUFFERS;    -- remove clean pages from the buffer pool for a cold-cache test
GO

Stored Procedure Execution Plan - Data Manipulation

I have a stored proc that processes a large amount of data (about 5m rows in this example). The performance varies wildly. I've had the process running in as little as 15 minutes and seen it run for as long as 4 hours.
For maintenance, and in order to verify that the logic and processing is correct, we have the SP broken up into sections:
1. TRUNCATE and populate a work table (indexed) that we can verify later with automated testing tools.
2. Join several tables together (including some of these work tables) to produce another work table.
3. Repeat 1 and/or 2 until a final output is produced.
My concern is that this is a single SP and so gets an execution plan when it is first run (even WITH RECOMPILE). But at that time, the work tables (permanent tables in a Work schema) are empty.
I am concerned that, regardless of the indexing scheme, the execution plan will be poor.
I am considering breaking up the SP and calling separate SPs from within it so that they could take advantage of an execution plan re-evaluated after the data in the work tables is built. I have also seen references to using EXEC to run dynamic SQL, which obviously might get a recompile as well.
I'm still trying to get SHOWPLAN permissions, so I'm flying quite blind.
Are you able to determine whether there are any locking problems? Are you running the SP in sufficiently small transactions?
Breaking it up into subprocedures should have no benefit.
Somebody should be concerned about your productivity, working without basic optimization resources. That suggests there may be other possible unseen issues as well.
Grab the free copy of "Dissecting SQL Server Execution Plans" at the link below and maybe you can pick up a tip or two from it that will give you some idea of what's really going on under the hood of your SP.
http://dbalink.wordpress.com/2008/08/08/dissecting-sql-server-execution-plans-free-ebook/
Are you sure that the variability you're seeing is caused by "bad" execution plans? This may be a cause, but there may be a number of other reasons:
"other" load on the db machine
when using different data, there may be "easy" and "hard" data
issues with having to allocate more memory/file storage
...
Have you tried running the SP with the same data a few times?
Also, in order to figure out what is causing the runtime variability, I'd try to do some detailed measuring to pin the problem down to a specific section of the code (the easiest way would be to insert some log calls at various points in the SP). Then try to explain why that section is slow (other than "5M rows" ;-)) and figure out a way to make it faster.
For now, I think there are a few questions to answer before going down the "splitting up the sp" route.
You're right, it is quite difficult for you to get a clear picture of what is happening behind the scenes until you can get the "actual" execution plans from several executions of your overall process.
One point to consider, perhaps: are your work tables physical or temporary tables? If they are physical, you will get a performance gain by inserting new data into a new table without an index (i.e. a heap), which you can then build an index on after all the data has been inserted, as sketched below.
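For example, a sketch with invented names: load the work table as a heap, then build the index once the data is in:
TRUNCATE TABLE Work.OrderStaging;                 -- heap: no indexes defined during the load
INSERT INTO Work.OrderStaging (OrderID, CustomerID, Amount)
SELECT o.OrderID, o.CustomerID, o.Amount
FROM dbo.Orders AS o;
CREATE INDEX IX_OrderStaging_CustomerID ON Work.OrderStaging (CustomerID);   -- index built after the insert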
Also, what is the purpose of your process? It sounds like you are moving quite a bit of data around, in which case you may wish to consider the use of partitioning. You can switch data in and out of your main table with relative ease.
Hope what I have detailed is clear but please feel free to pose further questions.
Cheers, John
In several cases I've seen, this kind of variability in execution times / query plans comes down to statistics. I would recommend some test runs with UPDATE STATISTICS executed against the tables you are using just before the process is run. This will both force a re-evaluation of the execution plan by SQL Server and, I suspect, give you more consistent results. Additionally, you may do well to see if the differences in execution time correlate with re-indexing jobs by your DBAs. Perhaps you could also gather some index health statistics before each run.
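For instance, a sketch of refreshing statistics before a run (the table name is invented):
UPDATE STATISTICS Work.OrderStaging WITH FULLSCAN;   -- one table, full scan for accurate histograms
EXEC sp_updatestats;                                 -- or a sampled refresh across the whole database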
If not, as other answerers have suggested, you are more likely suffering from locking and/or contention issues.
Good luck with it.
The only thing I can think that an execution plan would do wrong when there's no data is err on the side of using a table scan instead of an index, since table scans are super fast when the whole table will fit into memory. Are there other negatives you're actually observing or are sure are happening because there's no data when an execution plan is created?
You can force usage of indexes in your query...
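For example (table and index names are made up), a table hint that forces a particular index:
SELECT o.OrderID, o.OrderDate
FROM dbo.Orders AS o WITH (INDEX (IX_Orders_CustomerID))
WHERE o.CustomerID = 42;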
Seems to me like you might be going down the wrong path.
Is this an infeed or outfeed of some sort or are you creating a report? If it is a feed, I would suggest that you change the process to use SSIS which should be able to move 5 million records very fast.

Performance Tuning PostgreSQL

Keep in mind that I am a rookie in the world of sql/databases.
I am inserting/updating thousands of objects every second. Those objects are actively being queried for at multiple second intervals.
What are some basic things I should do to performance tune my (postgres) database?
It's a broad topic, so here's lots of stuff for you to read up on.
EXPLAIN and EXPLAIN ANALYZE are extremely useful for understanding what's going on in your db engine
Make sure relevant columns are indexed
Make sure irrelevant columns are not indexed (insert/update-performance can go down the drain if too many indexes must be updated)
Make sure your postgresql.conf is tuned properly
Know what work_mem is, and how it affects your queries (mostly useful for larger queries)
Make sure your database is properly normalized
VACUUM for clearing out old data
ANALYZE for updating statistics (the statistics target controls how much is gathered)
Persistent connections (you could use a connection manager like pgpool or pgbouncer)
Understand how queries are constructed (joins, sub-selects, cursors)
Caching of data (e.g. with memcached) is an option
And when you've exhausted those options: add more memory, faster disk-subsystem etc. Hardware matters, especially on larger datasets.
And of course, read all the other threads on postgres/databases. :)
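To make a few of the list items above concrete (table, column, and values are only illustrative):
VACUUM ANALYZE measurements;                                          -- reclaim dead rows and refresh planner statistics
ALTER TABLE measurements ALTER COLUMN device_id SET STATISTICS 500;   -- raise the statistics target for a skewed column, then re-ANALYZE
SET work_mem = '64MB';                                                -- per-session bump for a sort/hash-heavy query
EXPLAIN ANALYZE SELECT device_id, count(*) FROM measurements GROUP BY device_id;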
First and foremost, read the official manual's Performance Tips.
Running EXPLAIN on all your queries and understanding its output will let you know if your queries are as fast as they could be, and if you should be adding indexes.
Once you've done that, I'd suggest reading over the Server Configuration part of the manual. There are many options which can be fine-tuned to further enhance performance. Make sure to understand the options you're setting though, since they could just as easily hinder performance if they're set incorrectly.
Remember that every time you change a query or an option, test and benchmark so that you know the effects of each change.
Actually, there are some simple rules which will in most cases get you enough performance:
Indices are the first part. Primary keys are indexed automatically. I recommend putting indices on all foreign keys. Further, put indices on all columns which are frequently queried; if there are heavily used queries on a table where more than one column is filtered, put a combined index on those columns.
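A quick sketch of that advice (all names are invented):
CREATE INDEX orders_customer_id_idx ON orders (customer_id);            -- index on a foreign key column
CREATE INDEX orders_status_created_idx ON orders (status, created_at);  -- combined index for a query filtering on both columns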
Memory settings in your PostgreSQL installation. Set the following parameters higher: shared_buffers, work_mem, maintenance_work_mem, temp_buffers.
If it is a dedicated database machine you can easily set the first three of these to half the RAM (just be careful with shared_buffers under Linux; you may have to adjust the shmmax kernel parameter); in any other case it depends on how much RAM you would like to give to PostgreSQL.
http://www.postgresql.org/docs/8.3/interactive/runtime-config-resource.html
http://wiki.postgresql.org/wiki/Performance_Optimization
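As a rough postgresql.conf sketch of those settings, assuming a dedicated machine (the values are only illustrative, not recommendations for your workload):
shared_buffers = 2GB            # shared page cache; on Linux you may need to raise the shmmax kernel parameter
work_mem = 32MB                 # per sort/hash operation, so multiply by concurrent operations
maintenance_work_mem = 512MB    # used by VACUUM, CREATE INDEX, ALTER TABLE ADD FOREIGN KEY
temp_buffers = 32MB             # per-session buffers for temporary tables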
The absolute minimum I'll recommend is the EXPLAIN ANALYZE command. It will show a breakdown of subqueries, joins, et al., all the time showing the actual amount of time consumed in the operation. It will also alert you to sequential scans and other nasty trouble.
It is the best way to start.
Put fsync = off in your postgresql.conf if you trust your filesystem; otherwise each PostgreSQL commit is immediately flushed to disk (with the fsync system call).
We have had this option turned off on many production servers for nearly 10 years, and we have never had data corruption.

How do I clear oracle execution plan cache for benchmarking?

On Oracle 10gR2, I have several SQL queries whose performance I am comparing. But after their first run, the v$sql table has the execution plan stored for caching, so for one of the queries I go from 28 seconds on the first run to 0.5 seconds afterwards.
I've tried
ALTER SYSTEM FLUSH BUFFER_CACHE;
After running this, the query consistently runs at 5 seconds, which I do not believe is accurate.
I thought maybe I could delete the line item itself from the cache:
delete from v$sql where sql_text like 'select * from....
but I get an error about not being able to delete from view.
Peter gave you the answer to the question you asked.
alter system flush shared_pool;
That is the statement you would use to "delete prepared statements from the cache".
(Prepared statements aren't the only objects flushed from the shared pool, the statement does more than that.)
As I indicated in my earlier comment (on your question), v$sql is not a table. It's a dynamic performance view, a convenient table-like representation of Oracle's internal memory structures. You only have SELECT privilege on the dynamic performance views; you can't delete rows from them.
flush the shared pool and buffer cache?
The following doesn't answer your question directly. Instead, it answers a fundamentally different (and maybe more important) question:
Should we normally flush the shared pool and/or the buffer cache to measure the performance of a query?
In short, the answer is no.
I think Tom Kyte addresses this pretty well:
http://www.oracle.com/technology/oramag/oracle/03-jul/o43asktom.html
http://www.oracle.com/technetwork/issue-archive/o43asktom-094944.html
<excerpt>
Actually, it is important that a tuning tool not do that. It is important to run the test, ignore the results, and then run it two or three times and average out those results. In the real world, the buffer cache will never be devoid of results. Never. When you tune, your goal is to reduce the logical I/O (LIO), because then the physical I/O (PIO) will take care of itself.
Consider this: Flushing the shared pool and buffer cache is even more artificial than not flushing them. Most people seem skeptical of this, I suspect, because it flies in the face of conventional wisdom. I'll show you how to do this, but not so you can use it for testing. Rather, I'll use it to demonstrate why it is an exercise in futility and totally artificial (and therefore leads to wrong assumptions). I've just started my PC, and I've run this query against a big table. I "flush" the buffer cache and run it again:
</excerpt>
I think Tom Kyte is exactly right. In terms of addressing the performance issue, I don't think that "clearing the oracle execution plan cache" is normally a step for reliable benchmarking.
Let's address the concern about performance.
You tell us that you've observed that the first execution of a query takes significantly longer (~28 seconds) compared to subsequent executions (~5 seconds), even when flushing (all of the index and data blocks from) the buffer cache.
To me, that suggests that the hard parse is doing some heavy lifting. It's either a lot of work, or it's encountering a lot of waits. This can be investigated and tuned.
I'm wondering if perhaps statistics are non-existent, and the optimizer is spending a lot of time gathering statistics before it prepares a query plan. That's one of the first things I would check, that statistics are collected on all of the referenced tables, indexes and indexed columns.
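For example, a hedged sketch of gathering statistics on one referenced table (schema and table names are placeholders):
EXEC DBMS_STATS.GATHER_TABLE_STATS(ownname => 'APP', tabname => 'ORDERS', cascade => TRUE);   -- cascade => TRUE also gathers the table's index statistics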
If your query joins a large number of tables, the CBO may be considering a huge number of permutations for join order.
A discussion of Oracle tracing is beyond the scope of this answer, but it's the next step.
I'm thinking you are probably going to want to trace events 10053 and 10046.
Here's a link to an "event 10053" discussion by Tom Kyte you may find useful:
http://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:63445044804318
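If you do go down that path, the events are typically switched on per session, roughly like this (the trace levels are illustrative):
ALTER SESSION SET EVENTS '10053 trace name context forever, level 1';    -- optimizer decision trace
ALTER SESSION SET EVENTS '10046 trace name context forever, level 12';   -- SQL trace including binds and waits
-- run the query, then turn the events off:
ALTER SESSION SET EVENTS '10053 trace name context off';
ALTER SESSION SET EVENTS '10046 trace name context off';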
tangentially related anecdotal story re: hard parse performance
A few years back, I did see one query that had elapsed times in terms of MINUTES on first execution, and subsequent executions in terms of seconds. What we found was that the vast majority of the time for the first execution was spent on the hard parse.
This problem query was written by a CrystalReports developer who innocently (naively?) joined two humongous reporting views.
One of the views was a join of 62 tables, the other view was a join of 42 tables.
The query used Cost Based Optimizer. Tracing revealed that it wasn't wait time, it was all CPU time spent evaluating possible join paths.
Each of the vendor supplied "reporting" views wasn't too bad by itself, but when two of them were joined, it was agonizingly slow. I believe the problem was the vast number of join permutations that the optimizer was considering. There is an instance parameter that limits the number of permutations considered by the optimizer, but our fix was to re-write the query. The improved query only joined the dozen or so tables that were actually needed by the query.
(The initial, immediate short-term "band aid" fix was to schedule a run of the query earlier in the morning, before the report generation task ran. That made the report generation "faster", because the report generation run made use of the already prepared statement in the shared pool, avoiding the hard parse.
The band aid fix wasn't a real solution; it just moved the problem to a preliminary execution of the query, when the long execution time wasn't noticed.)
Our next step would have probably been to implement a "stored outline" for the query, to get a stable query plan.
Of course, statement reuse (avoiding the hard parse by using bind variables) is the normative pattern in Oracle. It improves performance, scalability, yada, yada, yada.
This anecdotal incident may be entirely different than the problem you are observing.
HTH
It's been a while since I worked with Oracle, but I believe execution plans are cached in the shared pool. Try this:
alter system flush shared_pool;
The buffer cache is where Oracle stores recently used data in order to minimize disk io.
We've been doing a lot of work lately with performance tuning queries, and one culprit for inconsistent query performance is the file system cache that Oracle is sitting on.
It's possible that while you're flushing the Oracle cache the file system still has the data your query is asking for meaning that the query will still return fast.
Unfortunately I don't know how to clear the file system cache - I just use a very helpful script from our very helpful sysadmins.
FIND ADDRESS AND HASH_VALUE OF SQL_ID
select address,hash_value,inst_id,users_executing,sql_text from gv$sqlarea where sql_id ='7hu3x8buhhn18';
PURGE THE PLAN FROM THE SHARED POOL (pass the ADDRESS and HASH_VALUE from the previous query, comma-separated; the 'c' flag means cursor)
exec sys.dbms_shared_pool.purge('0000002E052A6990,4110962728','c');
VERIFY
select address,hash_value,inst_id,users_executing,sql_text from gv$sqlarea where sql_id ='7hu3x8buhhn18';