Log join and where clause execution time in Oracle - sql

I would like to ask for some help with a query in Oracle Database.
I have a massive select with multiple tables joined together (over 10) and multiple where clauses applied (10-20). Some tables have 10 columns, some have 300+. Most tables have 10+ million rows, some of them even 60+ million.
The execution time is usually between 25 and 45 minutes, but sometimes it drops to 30 seconds. Monitoring the server shows that the load was almost the same in both cases.
We would like to optimize the select to reduce the usual execution time to 10-15 minutes or less.
My question is: Is there any tool or technique that can tell me which part of the query ran for so long (something that can show me that, in the last execution of the query, the 1st join took 36 secs, the 2nd join 40 secs, the 1st where clause 10 secs, etc.)?
(Note that I'm not asking for optimization advice, but for any tool or technique that can show which part/operation of the executed query took so long.)
Thanks in advance, I hope I was clear! :)

One option is to do the following:
add /*+ gather_plan_statistics */ to your query
execute the query
after the query finishes, run select * from table(dbms_xplan.display_cursor(null, null, 'ALLSTATS LAST'));
This gives you a plan with columns like actual rows, actual time, memory usage, and more.
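For example, a minimal sketch of the whole workflow (the orders/customers tables and the date filter are made up here purely for illustration):
select /*+ gather_plan_statistics */ o.order_id, c.name
from orders o
join customers c on c.customer_id = o.customer_id
where o.created_at > date '2020-01-01';
select * from table(dbms_xplan.display_cursor(null, null, 'ALLSTATS LAST'));
The A-Rows, A-Time and Buffers columns of the resulting plan show, per operation (each join, filter, etc.), how many rows it produced and how long it took, which is exactly the per-join/per-filter breakdown you are asking about.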
If you don't want to re-run the query you can generate actual rows and times of the last execution using a SQL Monitor Report like this:
select dbms_sqltune.report_sql_monitor(sql_id => 'add the sql_id here') from dual;
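If you don't know the sql_id, one way to look it up (a sketch, assuming the statement text is still in the shared pool; the LIKE pattern is only an illustration) is:
select sql_id, sql_text from v$sql where sql_text like '%distinctive fragment of your query%';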
Using these tools allows you to focus on the relevant operation. A plain old explain plan isn't good enough for complex queries, AWR doesn't focus on individual queries, and tracing is a huge waste of time when there are faster alternatives.

Related

Despite the existence of relevant indices, PostgreSQL query is slow

I have a table with 3 columns:
time (timestamptz)
price (numeric(8,2))
set_id (int)
The table contains 7.4M records.
I've created a simple index on time and an index on set_id.
I want to run the following query:
select * from test_prices where time BETWEEN '2015-06-05 00:00:00+00' and '2020-06-05 00:00:00+00';
Despite my indices, the query takes 2 minutes and 30 seconds.
See the explain analyze stats: https://explain.depesz.com/s/ZwSH
GCP postgres DB has the following stats:
What am I missing here? Why is this query so slow, and how can I improve it?
According to your explain plan, the query is returning 1.6 million rows out of 4.5 million. That means that a significant portion of the rows is being returned.
Postgres wisely decides that a full table scan is more efficient than using an index, because there is a good chance that all the data pages will need to be read anyway.
It is surprising that you are reporting 00:02:30 for the query. The explain is saying that the query completes in about 1.4 seconds -- which seems reasonable.
I suspect that the elapsed time is caused by the volume of data being returned (perhaps the rows are very wide), a slow network connection to the database, or contention on the database/server.
Your query selects two thirds of the table. A sequential scan is the most efficient way to process such a query.
Your query executes in under 2 seconds. It must be your client that takes a long time to render the query result (pgAdmin is infamous for that). Use a different client.
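One way to separate server-side execution time from transfer/rendering time (a sketch using the table from the question and the psql client) is to compare the two:
EXPLAIN (ANALYZE, BUFFERS) SELECT * FROM test_prices WHERE time BETWEEN '2015-06-05 00:00:00+00' AND '2020-06-05 00:00:00+00';
\timing on
SELECT * FROM test_prices WHERE time BETWEEN '2015-06-05 00:00:00+00' AND '2020-06-05 00:00:00+00';
If the EXPLAIN ANALYZE time stays around 1-2 seconds while the timed full fetch takes minutes, the time is going into sending and rendering 1.6 million rows, not into executing the query.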

How Can an Always-False PostgreSQL Query Run for Hours?

I've been trying to debug a slow query in another part of our system, and saw this query is active:
SELECT * FROM xdmdf.hematoma AS "zzz4" WHERE (0 = 1)
It has apparently been active for > 8 hours. With that WHERE clause, logically, this query should return zero rows. Why would a SQL engine even bother to evaluate it? Would a query like this be useful for anything, and if so, what could it be?
(xdmdf.hematoma is a view and I would expect SELECT * on it to take ~30 minutes under non-locky conditions.)
This statement:
explain select 1 from xdmdf.hematoma limit 1
(no analyze) has been running for about 10 minutes now.
There are two possibilities:
It takes forever to plan the statement, because you changed some planner settings and the view definition is so complicated (partitioned table?).
This is the unlikely explanation.
A concurrent transaction is holding an ACCESS EXCLUSIVE lock on a table involved in the view definition.
Terminate any such concurrent transactions.
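A sketch of how you could find such a blocking session (pg_blocking_pids() is available from PostgreSQL 9.6 on; terminate with care):
SELECT pid, pg_blocking_pids(pid) AS blocked_by, state, query FROM pg_stat_activity WHERE cardinality(pg_blocking_pids(pid)) > 0;
SELECT pg_terminate_backend(12345);  -- 12345 stands in for the blocking pid found above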

What's the curve for a simple select query?

This is a conceptual question.
Hypothetically, when I do select * from table_name and the table has 1 million records, it takes about 3 secs.
Similarly, when I select 10 million records, the time taken is about 30 secs. But I am told that the time to select records is not linearly proportional to their number; after a certain point, does the time required increase exponentially?
Please help me understand how this works.
There are things that can make one query take longer than another, even for simple selects with no where clauses or joins.
First, the time to return the query depends on how busy the network is at the time the query is run. It could also depend on whether there are any locks on the data or how much memory is available.
It also depends on how wide the tables are, and in general how many bytes an individual record contains. For instance, I would expect a 10 million record table with only two columns, both ints, to return much faster than a million record table with 50 columns, including some large ones, especially if those are things like documents stored as database objects or fields with too much text to fit into an ordinary varchar or nvarchar column (in SQL Server these would be nvarchar(max) or text, for instance). I would expect this because there is simply less total data to return even though there are more records: 10 million rows of two 4-byte ints is on the order of 80 MB, while 1 million rows averaging a few kilobytes each can easily add up to gigabytes.
As you start adding where clauses and joins, of course, there are many more things that affect the performance of an individual query. If you query databases, you should read a good book on performance tuning for your particular database. There are many things you can do without realizing it that cause queries to run more slowly than they need to. You should learn the techniques that produce the queries most likely to be performant.
I think this is different for each database server. Try to monitor the performance while you fire your queries (what happens to the memory and CPU?).
Eventually all hardware components have a bottleneck. If you come close to that point the server might 'suffocate'.

query slow once an hour or so, takes less than 100 ms 99% of the time

I am troubleshooting a slow query. It runs in less than 100 ms 99% of the time, but about once an hour (or two; there's no pattern, I guess) it goes bad, does 6 million reads, and takes 11 seconds! I looked at the query plan and it does a clustered index scan. I noticed that the use counts column of the cached_plans dynamic management view keeps increasing every time the query executes, so I'm thinking it's the same plan; I'm just wondering why at some point it goes out of whack! Any pointers will be helpful. I haven't tried anything, as it runs pretty fast most of the time.
First, something could easily be blocking the query and making it run slowly. Or there could be other things happening on the server at the same time that are consuming most of its resources.
Next, the parameters of the query might be bad for the saved execution plan.
Or the statistics might be out of date (a sketch for checking this follows below).
Or if the query is an action query as opposed to a select, the particular parameters may be causing a problem in a trigger that makes it take longer.
Or the query might be returning significantly more results at times. If you run it at 10:00 and return 10 results, and an import then puts more records in the table that meet the query conditions, at 10:30 you might return a million results, which would clearly be slower.
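A sketch of how you might check for stale statistics in SQL Server (dbo.YourTable is a placeholder for the table the query hits):
SELECT s.name, STATS_DATE(s.object_id, s.stats_id) AS last_updated FROM sys.stats AS s WHERE s.object_id = OBJECT_ID('dbo.YourTable');
UPDATE STATISTICS dbo.YourTable;  -- refresh them if they look stale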
One thing I like to do in such circumstances is set up logging so that the exact query is logged along with its execution time. Then you can see what the query that ran slowly actually was; if you have variable parameters, they might be different from run to run. A minimal sketch of such logging follows.
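This sketch assumes SQL Server; the table, its columns, and the @CustomerId parameter are invented for illustration:
CREATE TABLE dbo.QueryExecutionLog (
    LogId      int IDENTITY(1,1) PRIMARY KEY,
    ExecutedAt datetime2     NOT NULL DEFAULT SYSUTCDATETIME(),
    QueryText  nvarchar(max) NOT NULL,
    DurationMs int           NOT NULL
);
DECLARE @CustomerId int = 42;  -- hypothetical parameter of the query being investigated
DECLARE @start datetime2 = SYSUTCDATETIME();
-- ... the query under investigation runs here ...
INSERT INTO dbo.QueryExecutionLog (QueryText, DurationMs)
VALUES (N'select ... where CustomerId = ' + CAST(@CustomerId AS nvarchar(20)), DATEDIFF(millisecond, @start, SYSUTCDATETIME()));
Reviewing the logged text and durations makes it obvious whether the slow runs correspond to particular parameter values or particular times of day.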

After writing SQL statements in MySQL, how to measure the speed / performance of them?

I saw something from an "execution plan" article:
10 rows fetched in 0.0003s (0.7344s)
(the link: http://explainextended.com/2009/09/18/not-in-vs-not-exists-vs-left-join-is-null-mysql/ )
How come there are 2 durations shown? And what if I don't have a large data set yet? For example, if I have only 20, 50, or even just 100 records, I can't really measure how two different SQL statements compare in terms of speed in a real-life situation. In other words, do there need to be at least hundreds of thousands of records, or even a million, to accurately compare the performance of those 2 different SQL statements?
For your first question:
X row(s) fetched in Y s (Z s)
X = number of rows (of course);
Y = time it took the MySQL server to execute the query (parse, retrieve, send);
Z = time the resultset spent in transit from the server to the client;
(Source: http://forums.mysql.com/read.php?108,51989,210628#msg-210628)
For the second question, you will never ever know how the query performs unless you test with a realistic number of records. Here is a good example of how to benchmark correctly: http://www.mysqlperformanceblog.com/2010/04/21/mysql-5-5-4-in-tpcc-like-workload/
That blog in general as well as the book "High Performance MySQL" is a goldmine.
The best way to test and compare the performance of operations is often (if not always!) to work with a realistic set of data.
If you plan on having millions of rows when your application is in production, then you should test with millions of rows right now, not just a dozen!
A couple of tips :
While benchmarking, use select SQL_NO_CACHE ..., instead of select ...
This will prevent MySQL from using its query cache (which would otherwise make the first execution take a normal amount of time and re-executions of the same query much faster); see the sketch after this list.
Learn how to use EXPLAIN, and understand its output
Read the Chapter 7. Optimization section of the manual ;-)
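A quick sketch of both tips together (the orders table and customer_id filter are hypothetical; on MySQL 8.0+ the query cache has been removed, so SQL_NO_CACHE only matters on older versions):
SELECT SQL_NO_CACHE * FROM orders WHERE customer_id = 42;
EXPLAIN SELECT * FROM orders WHERE customer_id = 42;
The EXPLAIN output tells you which indexes are considered and roughly how many rows MySQL expects to examine.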
Generally when there are 2 times shown, one is CPU time and one is wall-clock time. I cannot recall which is which, but it appears that the first is the CPU time and the second is elapsed time.