I have encountered a strange situation in which an SQL query takes several seconds to complete when run from Toad, while a Jasper Report containing the same query takes over half an hour to produce results (with the same parameters). Here are some details:
I checked, and Oracle (version 11g) uses different execution plans in these two cases.
I considered using stored outlines, but the report slightly modifies the query (bind variables are renamed; for multi-value parameters, i.e. $P!{...}, the report simply inserts the values into the query text, and there are too many value combinations to create an outline for each), so outlines won't work.
I ran the report both in iReport 5.1 and via OpenReports, and it takes about 35 minutes either way.
The original query is tuned with some hints; without them, it takes about as long to complete as the report does.
I would appreciate any advice on how to deal with this.
First of all, don't use TOAD for query tuning. It is in TOAD's interest to present the first few rows of the result set to you as fast as possible, to make the application feel as responsive as possible. To do so, TOAD injects a FIRST_ROWS hint into your query. A nice feature, but not one for tuning queries.
Now, to address your query that's taking too long: I suggest you first investigate where the time is being spent. You can do so by tracing the query execution, as explained here. Once you have done that, and you know where the time is going but still don't know how to fix it, please post the execution plan and statistics.
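As a concrete starting point, here is a minimal sketch of session-level tracing (the tracefile identifier is just an example name, and you need the ALTER SESSION privilege):

ALTER SESSION SET TRACEFILE_IDENTIFIER = 'slow_report';
ALTER SESSION SET EVENTS '10046 trace name context forever, level 12';
-- run the slow query here, then switch tracing off:
ALTER SESSION SET EVENTS '10046 trace name context off';

Level 12 records bind values and wait events; you can then format the resulting trace file from the server's trace directory with tkprof to see where the time went.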
There are probably differences in the optimizer environment. You can check this using
select *
from V$SES_OPTIMIZER_ENV
where sid = sys_context('userenv', 'sid')
Run this in your Toad session and from a Jasper report, and compare the results.
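If you have both sessions connected at the same time, you can also diff the two environments directly; a sketch, assuming you look up the two SIDs in V$SESSION first (:toad_sid and :jasper_sid are illustrative bind names):

select t.name, t.value as toad_value, j.value as jasper_value
from V$SES_OPTIMIZER_ENV t
join V$SES_OPTIMIZER_ENV j on j.name = t.name
where t.sid = :toad_sid
  and j.sid = :jasper_sid
  and t.value <> j.value;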
I have been working with R, interfacing with my Oracle DB using the DBI package. I read that preparing a query is often good practice when running the same statement multiple times.
My question is: assuming infinite RAM to accommodate the downloaded data, which factors may influence the difference in run times between two scenarios: running a prepared query N times, or using a single WHERE ... BETWEEN filter?
Let's say I have to run a query to analyze some time-series information between 2012 and 2018. I have found different download times between running a prepared query for each month within my analysis window and just filtering for the whole window at once.
It depends on how the database optimizes your query. Maybe it chooses to use an index when selecting just a single month; maybe it chooses a full table scan to retrieve the whole window at once.
Usually I would expect the query that retrieves the entire dataset at once to be more efficient than breaking it up into one query per month (a minimal sketch of both approaches follows the list below).
Factors that play a role are, among others:
What percentage of rows in the table are you accessing?
Are there indexes that can be used?
Which data was recently accessed (and might be cached)?
How much data can the database handle/cache in memory?
Did you use bind variables for the statements?
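To make the two scenarios concrete, a minimal sketch (the readings table, its ts column, and the date range are hypothetical):

-- scenario 1: one prepared statement, executed once per month with bind variables
select * from readings where ts >= :month_start and ts < :month_end;

-- scenario 2: one query covering the whole analysis window
select * from readings where ts between date '2012-01-01' and date '2018-12-31';

With bind variables, scenario 1 is parsed once and executed N times; whether that beats scenario 2 depends mostly on the access path the optimizer picks for each shape.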
I have a query that behaves differently for different dates. The filter condition in the query is as below:
Last_Update_Date>TO_DATE(SUBSTR('#BIAPPS.LAST_EXTRACT_DATE',0,19),'YYYY-MM-DD HH24:MI:SS')
The value for the variable #BIAPPS.LAST_EXTRACT_DATE is passed in from an application, and its datatype is alphanumeric.
If the value passed to #BIAPPS.LAST_EXTRACT_DATE is 2017-12-20 00:00:00, the query extracts 200K records in 10 minutes.
If the value passed to #BIAPPS.LAST_EXTRACT_DATE is 2018-01-02 00:00:00, the query extracts 80K records in 120 minutes.
Is there any reason Oracle behaves like this, and do I need to correct anything?
My guess is that in the first case it is using an index on Last_Update_Date, while in the second case it is not. (Another guess is that the reverse is true, and the query without the index is actually the faster one.)
The best first step to diagnose the problem is to view the execution plan for both queries. If you don't know how to do that, you may find questions related to it here. The quick way in SQL*Plus is SET AUTOTRACE TRACEONLY EXPLAIN.
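If autotrace is not available to you, EXPLAIN PLAN with DBMS_XPLAN works in any Oracle client; a sketch, with my_table standing in for your actual table:

EXPLAIN PLAN FOR
select * from my_table
where Last_Update_Date > TO_DATE('2018-01-02 00:00:00', 'YYYY-MM-DD HH24:MI:SS');

select * from table(DBMS_XPLAN.DISPLAY);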
I assume you mean that you are doing text substitution into this query template, so from the Oracle parser's point of view they are two different queries. One possible solution to your problem would be to use a bind variable instead, so the parser would see both as the same query and use the same execution plan. (At least, it probably would; in recent versions of Oracle there can be more variation for the same query.) However, this could lead to a situation where you get the "bad" execution plan in more cases.
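As a sketch, the difference between the two approaches looks like this (:last_extract_date is an illustrative bind name):

-- literal substitution: every new date produces a distinct statement to hard-parse
Last_Update_Date > TO_DATE(SUBSTR('2018-01-02 00:00:00', 0, 19), 'YYYY-MM-DD HH24:MI:SS')

-- bind variable: one shared statement and, normally, one shared plan
Last_Update_Date > TO_DATE(SUBSTR(:last_extract_date, 0, 19), 'YYYY-MM-DD HH24:MI:SS')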
Based on the fact you are using very recent dates, a possible root cause is that the statistics on this table have been gathered sometime between the two dates in question. So the parser has a good estimate of how many records the first query will return based on recorded column statistics and/or histogram; but for the second query it needs to do an extrapolation since the date is outside the range of values recorded in the statistics. (I saw this a lot in a system I used to work on.)
In that case, another possible solution is to explicitly refresh statistics on that table every night. This may not help if the query uses today's date, but if all the queries use dates before today, it may work well.
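A minimal sketch of such a refresh, with placeholder schema and table names, which you could schedule nightly:

BEGIN
  DBMS_STATS.GATHER_TABLE_STATS(
    ownname => 'APP_OWNER',  -- placeholder schema name
    tabname => 'MY_TABLE'    -- placeholder table name
  );
END;
/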
There are also various ways to force/guide Oracle to use a desired execution plan. The old-fashioned one is explicit hinting. In this case, if my first guess was correct, you might add an INDEX hint to the query. There have been a number of features added to Oracle over the years to help with this. I think the current primary one is called "SQL Plan Management" so you could research that.
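For illustration, an INDEX hint would look roughly like this (the table alias and index name are placeholders):

select /*+ INDEX(t my_table_upd_date_ix) */ *
from my_table t
where t.Last_Update_Date > TO_DATE(SUBSTR('#BIAPPS.LAST_EXTRACT_DATE',0,19),'YYYY-MM-DD HH24:MI:SS');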
I find that the two queries given below, when fired on PostgreSQL, generate different query execution times:
Query1:
\timing
select s0.value,s1.value,s2.value,s3.value,s4.value
from (
select f0.subject as r0,f0.predicate as r1,f0.object as r2,f1.predicate as r3,f1.object as r4
from schemaName.facts f0,schemaName.facts f1
where f1.subject=f0.subject
) facts,schemaName.strings s0,schemaName.strings s1,schemaName.strings s2,schemaName.strings s3,schemaName.strings s4
where s0.id=facts.r0 and s1.id=facts.r1 and s2.id=facts.r2 and s3.id=facts.r3 and s4.id=facts.r4;
Query1 rewritten:
select s0.value,s1.value,s2.value,s3.value,s4.value
from schemaName.strings s0,schemaName.strings s1,schemaName.strings s2,schemaName.strings s3,schemaName.strings s4,schemaName.facts f0,schemaName.facts f1
where s0.id=f0.subject and s1.id=f0.predicate and s2.id=f0.object and s3.id=f1.predicate and s4.id=f1.object and f0.subject=f1.subject;
I am unable to understand the reason behind postgresql generating different query execution times. Can someone please help me understand this?
PostgreSQL comes with a very nice pair of commands: EXPLAIN and EXPLAIN ANALYZE. The former prints out the query plan with estimates of how long things will take, while the latter actually runs the query and annotates the plan with the real execution costs.
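For example, your first query run under EXPLAIN ANALYZE (rewritten here with equivalent explicit JOIN syntax) shows the chosen plan together with the actual time spent in each node:

EXPLAIN ANALYZE
select s0.value, s1.value, s2.value, s3.value, s4.value
from schemaName.facts f0
join schemaName.facts f1 on f1.subject = f0.subject
join schemaName.strings s0 on s0.id = f0.subject
join schemaName.strings s1 on s1.id = f0.predicate
join schemaName.strings s2 on s2.id = f0.object
join schemaName.strings s3 on s3.id = f1.predicate
join schemaName.strings s4 on s4.id = f1.object;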
PostgreSQL uses a whole mess of criteria and heuristics to decide how best to run a query: everything from sequential and random access costs (tunable in the configs) to statistical samplings of the data in the tables.
I've found that very often it will come up with the same query plan given two radically different-looking queries (assuming they give the same results), but I've also seen the query structure affect the plan. The best way to see what it is doing is to ask it to explain.
All of that said: the second run will always be faster than the first, since the data is now cached. So, if you are really trying to compare runtimes, be sure to run each query at least four times, drop the first one, and average the rest.
I am working on tuning a stored procedure. It is a huge stored proc and joins tables that have about 6-7 million records.
My question is: how do I determine the time spent in the components of the proc? The proc has one big SELECT with many temp tables created on the fly (read: sub-queries).
I tried using SET STATISTICS TIME ON, SET SHOWPLAN_ALL ON.
I am looking to isolate the chunk of code that takes the most time and am not sure how to do it.
Please help.
PS: I did try to Google it and searched on Stack Overflow, with no luck. Here is one question that I looked at:
How to improve SQL Server query containing nested sub query
Any help is really appreciated. Thanks in advance.
I would try out SQL Sentry's SQL Plan Explorer. It gives you visual help in finding the problem, and it is a free tool. It highlights the bits that cost a lot of I/O or CPU, rather than just a generic percentage.
Here's where you can check it out:
http://www.sqlsentry.net/plan-explorer/sql-server-query-view.asp
I realize you're asking for the "time" (the how long), but maybe you should focus on the "what". What I mean is: tune based on the results of the execution plan. Ideally, using "Show Execution Plan" is going to give you the biggest bang, and it will tell you, via percentages, where the most resources are being spent.
If you are in SSMS 2008, you can right-click in your query window and click "Include Actual Execution Plan".
In your scenario, the best way to do this is to just run the components individually. Bear in mind that the advice below is primarily about tuning for execution time (in a low-contention, low-concurrency environment); you may have other priorities under a heavy concurrent load.
I have to do a very similar break down on a regular basis for different procedures I have to tune. As a rule the general methodology I follow is:
1 - Do a baseline run
2 - Add PRINT or RAISERROR commands between portions that output the current time, to help identify which steps take the longest (a sketch follows this list).
3 - Break down the queries individually. I normally run portions on their own (omitting JOIN conditions) to see what the variance is. If it is a very long-running query, you can add a TOP clause to any SELECTs to limit the rows returned. As long as you are consistent, this will still give you a good idea.
4 - Tweak the components from step 3 that take the most time. If you have complicated subqueries, maybe turn them into indexed #temp tables to see if that helps. As a rule, CTEs never help performance, so you may need to materialize those as well.
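A minimal sketch of the timing markers from step 2 (the step names are placeholders; RAISERROR at severity 0 WITH NOWAIT flushes the message immediately, whereas PRINT output can be buffered):

DECLARE @ts varchar(23);

-- ... first portion of the proc ...
SET @ts = CONVERT(varchar(23), GETDATE(), 121);
RAISERROR('step 1 (build temp tables) done at %s', 0, 1, @ts) WITH NOWAIT;

-- ... next portion ...
SET @ts = CONVERT(varchar(23), GETDATE(), 121);
RAISERROR('step 2 (main select) done at %s', 0, 1, @ts) WITH NOWAIT;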
Are there any hints in Oracle that work the same way as these SQL Server hints?
Recompile: the query is recompiled every time it is run (useful if execution plans vary greatly depending on parameters). Would this best be compared to cursor_sharing in Oracle?
Optimize for: when you want the plan to be optimized for a certain parameter value even if a different one is used the first time the SQL is run. I guess this could maybe also be addressed with cursor_sharing?
Since you're using 11g, Oracle should use adaptive cursor sharing by default. If you have a query that uses bind variables and the histogram on the column with skewed data indicates that different bind variable values should use different query plans, Oracle will maintain multiple query plans for the same SQL statement automatically. There would be no need to specifically hint the queries to get this behavior, it's already baked in to the optimizer.
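If you want to confirm this is happening for a given statement, V$SQL exposes the relevant flags; a sketch, using a comment tag purely as a way to find the statement:

select sql_id, child_number, is_bind_sensitive, is_bind_aware, executions
from v$sql
where sql_text like 'select /* acs_check */%';

A child cursor marked bind-sensitive is being watched; once Oracle decides that different bind values deserve different plans, it creates additional bind-aware children.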
I didn't know, but found a discussion with some solutions here on forums.oracle.com