Which metrics to compare when evaluating SQL query performance? - sql

I recently watched an online course about Oracle SQL performance tuning. In the video, the lecturer constantly compares the COST value from Autotrace when comparing the performance of two queries.
But I've also read on other forums and websites that COST is a relative value specific to that query and should not be used as an absolute metric for evaluating performance. They suggest looking at things like consistent gets, physical reads, etc. instead.
So my interpretation is that it makes no sense to compare the COST value for completely different queries that are meant for different purposes, because the COST value is relative. But when comparing two versions of the same query, one of which has been slightly modified for "better performance", it is okay to compare the COST values. Is my interpretation accurate?
When is it okay to compare the COST value as opposed to some other metric?
What other metrics should we look at when evaluating/comparing query performance?

In general, I would be very wary about comparing the cost between two queries unless you have a very specific reason to believe that makes sense.
In general, people don't look at the 99.9% of queries that the optimizer produces a (nearly) optimal plan for. People look at queries where the optimizer has produced a decidedly sub-optimal plan. The optimizer will produce a sub-optimal plan for one of two basic reasons-- either it can't transform a query into a form it can optimize (in which case a human likely needs to rewrite the query) or the statistics it is using to make its estimates are incorrect so what it thinks is an optimal plan is not. (Of course, there are other reasons queries might be slow-- perhaps the optimizer produced an optimal plan but the optimal plan is doing a table scan because an index is missing for example.)
If I'm looking at a query that is slow and the query seems to be reasonably well-written and a reasonable set of indexes are available, statistics are the most likely source of problems. Since cost is based entirely on statistics, however, that means that the optimizer's cost estimates are incorrect. If they are incorrect, the cost is roughly equally likely to be incorrectly high or incorrectly low. If I look at the query plan for a query that I know needs to aggregate hundreds of thousands of rows to produce a report and I see that the optimizer has assigned it a single-digit cost, I know that somewhere along the line it is estimating that a step will return far too few rows. In order to tune that query, I'm going to need the cost to go up so that the optimizer's estimates accurately reflect reality. If I look at the query plan for a query I know should only need to scan a handful of rows and I see a cost in the tens of thousands, I know that the optimizer is estimating that some step will return far too many rows. In order to tune that query, I'm going to need the cost to go down so that the optimizer's estimates reflect reality.
If you use the gather_plan_statistics hint, you'll see the estimated and actual row counts in your query plan. If the optimizer's estimates are close to reality, the plan is likely to be pretty good and cost is likely to be reasonably accurate. If the optimizer's estimates are off, the plan is likely to be poor and the cost is likely to be wrong. Trying to use a cost metric to tune a query without first confirming that the cost is reasonably close to reality is seldom very productive.
Personally, I would ignore cost and focus on metrics that are likely to be stable over time and that are actually correlated with performance. My bias would be to focus on logical reads since most systems are I/O bound but you could use CPU time or elapsed time as well (elapsed time, though, tends not to be particularly stable because it depends on what happens to be in cache at the time the query is run). If you're looking at a plan, focus on the estimated vs. actual row counts not on the cost.

The actual run time of a query is by far the most important metric for tuning queries. We can ignore cost and other metrics 99.9% of the time.
If the query is relatively small and fast, we can easily re-run it and find the actual run times with the GATHER_PLAN_STATISTICS hint:
-- Add a hint to the query and re-run it.
select /*+ gather_plan_statistics */ count(*) from all_objects;
-- Find the SQL_ID of your query.
select sql_id, sql_fulltext from gv$sql where lower(sql_text) like '%gather_plan_statistics%';
-- Plug in the SQL_ID to find an execution plan with actual numbers.
select * from table(dbms_xplan.display_cursor(sql_id => 'bbqup7krbyf61', format => 'ALLSTATS LAST'));
If the query was very slow, and we can't easily re-run it, generate a SQL Monitor report. This data is usually available for a few hours after the last execution.
-- Generate a SQL Monitor report.
select dbms_sqltune.report_sql_monitor(sql_id => 'bbqup7krbyf61') from dual;
There are whole books written about interpreting the results. The basics are you want to first examine the execution plan and focus on the operations with the largest "A-Time". If you want to understand where the query or optimizer went bad, compare the "E-Rows" with "A-Rows", since the estimated cardinality drives most of the optimizer decisions.
Example output:
SQL_ID bbqup7krbyf61, child number 0
-------------------------------------
select /*+ gather_plan_statistics */ count(*) from all_objects
Plan hash value: 3058112905
--------------------------------------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | Reads | OMem | 1Mem | Used-Mem |
--------------------------------------------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 1 |00:00:03.58 | 121K| 622 | | | |
| 1 | SORT AGGREGATE | | 1 | 1 | 1 |00:00:03.58 | 121K| 622 | | | |
|* 2 | FILTER | | 1 | | 79451 |00:00:02.10 | 121K| 622 | | | |
|* 3 | HASH JOIN | | 1 | 85666 | 85668 |00:00:00.12 | 1479 | 2 | 2402K| 2402K| 1639K (0)|
| 4 | INDEX FULL SCAN | I_USER2 | 1 | 148 | 148 |00:00:00.01 | 1 | 0 | | | |
...
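The comparison of "E-Rows" with "A-Rows" described above can be scripted once the plan text is available. The following is a minimal sketch in Python, not part of the original answer: it parses the sample rows shown above and reports how far each estimate is from the actual row count. The function names are hypothetical, the K/M suffixes on row counts are assumed to mean thousands and millions, and E-Rows is assumed to be a per-start estimate (hence the multiplication by Starts).

```python
# Sample rows copied from the ALLSTATS LAST output above.
PLAN_ROWS = """\
| 0 | SELECT STATEMENT | | 1 | | 1 |00:00:03.58 | 121K| 622 | | | |
| 1 | SORT AGGREGATE | | 1 | 1 | 1 |00:00:03.58 | 121K| 622 | | | |
|* 2 | FILTER | | 1 | | 79451 |00:00:02.10 | 121K| 622 | | | |
|* 3 | HASH JOIN | | 1 | 85666 | 85668 |00:00:00.12 | 1479 | 2 | 2402K| 2402K| 1639K (0)|
| 4 | INDEX FULL SCAN | I_USER2 | 1 | 148 | 148 |00:00:00.01 | 1 | 0 | | | |
"""

# Assumption: row-count suffixes are decimal (K = 1,000 and M = 1,000,000).
SUFFIXES = {"K": 1_000, "M": 1_000_000, "G": 1_000_000_000}


def parse_count(text):
    """Convert a plan cell such as '85666' or '12M' to an int; blank cells give None."""
    text = text.strip()
    if not text:
        return None
    if text[-1] in SUFFIXES:
        return int(float(text[:-1]) * SUFFIXES[text[-1]])
    return int(text)


def parse_plan(text):
    """Parse plan rows into dicts with id, operation, starts, e_rows, a_rows, a_time_s."""
    steps = []
    for line in text.splitlines():
        cells = [cell.strip() for cell in line.split("|")[1:-1]]
        # Skip header and separator lines: the first cell must be a step id.
        if len(cells) < 7 or not cells[0].lstrip("* ").isdigit():
            continue
        hours, minutes, seconds = cells[6].split(":")
        steps.append({
            "id": int(cells[0].lstrip("* ")),
            "operation": cells[1],
            "starts": parse_count(cells[3]),
            "e_rows": parse_count(cells[4]),
            "a_rows": parse_count(cells[5]),
            "a_time_s": int(hours) * 3600 + int(minutes) * 60 + float(seconds),
        })
    return steps


def misestimate_factor(step):
    """Return how many times the estimate is off (>= 1.0), or None if there is no estimate."""
    if step["e_rows"] is None or step["a_rows"] is None:
        return None
    estimated = max(step["e_rows"] * (step["starts"] or 1), 1)
    actual = max(step["a_rows"], 1)
    return max(estimated, actual) / min(estimated, actual)


steps = parse_plan(PLAN_ROWS)
for step in steps:
    factor = misestimate_factor(step)
    shown = "n/a" if factor is None else f"{factor:.2f}x"
    print(f"{step['id']:>2} {step['operation']:<18} A-Time={step['a_time_s']:.2f}s estimate off by {shown}")
```

In this sample the estimates for the hash join and the index scan are almost exact, so the cardinality estimates are not the reason those steps take time; a factor in the tens or hundreds on some step would be the place to start looking.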

As with most things in Engineering, it really comes down to why / what you are comparing and evaluating for.
COST is a general time-based estimate for Oracle that is used as the ranking metric in its internal optimiser. This answer explains that selection process pretty well.
In general, COST as a metric is a good way to compare the expected computation time of two different queries, since it measures the estimated time cost of the query expressed as a number of block reads. So, if you are comparing the performance of two versions of the same query, one of them optimised for time, then COST is a good metric to use.
However, if your query or system is bottlenecked or constrained by something other than time (e.g. memory efficiency), then COST will be a poor metric to optimise against. In those cases, you should pick a metric that is relevant to your end goal.

Related

What's the "PARALLEL" equivalent in SQL Server

I have this problem where I need to do a COUNT(COLUMN_NAME) and SUM(COLUMN_NAME) on a few of the tables. The issue is that it's taking forever on SQL Server to do this.
We have over 2 billion records for which I need to perform these operations.
In Oracle, we can force a parallel execution for a single query/session by using a PARALLEL hint. For example for a simple SELECT COUNT, we can do
SELECT /*+ PARALLEL */ COUNT(1)
FROM USER.TABLE_NAME;
I searched for something similar in SQL Server and couldn't come up with anything concrete where I can specify a table hint for parallel execution. I believe SQL Server decides for itself whether to do a parallel or sequential execution depending on the query cost.
The same query in Oracle with a parallel hint takes 2-3 minutes to run, whereas on SQL Server it takes about an hour and a half.
I am reading the article Forcing a Parallel Query Execution Plan. To me it looks like you could force a parallel execution for testing purposes. The author says in the conclusion:
Conclusion
Even experts with decades of SQL Server experience and detailed
internal knowledge will want to be careful with this trace flag. I
cannot recommend you use it directly in production unless advised by
Microsoft, but you might like to use it on a test system as an extreme
last resort, perhaps to generate a plan guide or USE PLAN hint for use
in production (after careful review).
This is an arguably lower risk strategy, but bear in mind that the
parallel plans produced under this trace flag are not guaranteed to be
ones the optimizer would normally consider. If you can improve the
quality of information provided to the optimizer instead to get a
parallel plan, go that way :)
The article is referring to a Trace Flag:
There’s always a Trace Flag
In the meantime, there is a workaround. It’s not perfect (and most
certainly a choice of very last resort) but there is an undocumented
(and unsupported) trace flag that effectively lowers the cost
threshold to zero for a particular query
So, as far as I understand this article, you could do something like this:
SELECT
COUNT(1)
FROM
USER.TABLE_NAME
OPTION (RECOMPILE, QUERYTRACEON 8649)
In Oracle, if you do a SELECT COUNT() on a column, the query can be answered from an index. In the plan below you can see "INDEX FAST FULL SCAN", which makes the query run faster. You can try the same in SQL Server: does your table have an index? Try creating an index on the column you are counting. Note that Oracle may use an index on a different column: the query below has "count(DN)" but uses the index of some other column.
SQL> set linesize 500
SQL> set autotrace traceonly
SQL> select count(DN) from My_TOPOLOGY;
Execution Plan
----------------------------------------------------------
Plan hash value: 2512292876
--------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Cost (%CPU)| Time |
--------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 164 (64)| 00:00:01 |
| 1 | SORT AGGREGATE | | 1 | | |
| 2 | INDEX FAST FULL SCAN| FM_I2_TOPOLOGY | 90850 | 164 (64)| 00:00:01 |
--------------------------------------------------------------------------------
Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
180 consistent gets
177 physical reads
0 redo size
529 bytes sent via SQL*Net to client
524 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed

Determine Oracle query execution time and proposed datasize without actually executing query

In Oracle, is there any way to determine how long a SQL query will take to fetch all of its records, and what the size of the result will be, without actually executing it and waiting for the entire result?
I am repeatedly asked to download and provide data to users using normal Oracle SQL SELECT statements (not Data Pump, import, etc.). Sometimes the row counts are in the millions.
The actual run time is not known unless you run the query, but you can try to estimate it.
First, you can run EXPLAIN PLAN only. This will NOT run the query; based on your current statistics, it shows more or less how the query will be executed.
It will not include the actual time and effort needed to read the data from the data blocks. Other things to consider:
Do you have a large block size?
Is this schema normalized or denormalized for querying/reporting?
How large is a row? Does it fit in a single block, so that only one fetch is needed?
The number of rows you are expecting.
The amount of data * your network latency.
Based on this you can try to estimate the time.
This requires good statistics, explain plan for ..., adjusting sys.aux_stats$, and then adjusting your expectations.
Good statistics: The explain plan estimates are based on optimizer statistics. Make sure that tables and indexes have up-to-date statistics. On 11g this usually means sticking with the default settings and tasks, and only manually gathering statistics after large data loads.
Explain plan for ...: Use a statement like this to create and store the explain plan for any SQL statement. This even works for creating indexes and tables.
explain plan set statement_id = 'SOME_UNIQUE_STRING' for
select * from dba_tables cross join dba_tables;
This is usually the best way to visualize an explain plan:
select * from table(dbms_xplan.display);
Plan hash value: 2788227900
-------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Time |
-------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 12M| 5452M| 00:00:19 |
|* 1 | HASH JOIN RIGHT OUTER | | 12M| 5452M| 00:00:19 |
| 2 | TABLE ACCESS FULL | SEG$ | 7116 | 319K| 00:00:01 |
...
The raw data is stored in PLAN_TABLE. The first row of the plan usually sums up the estimates for the other steps:
select cardinality, bytes, time
from plan_table
where statement_id = 'SOME_UNIQUE_STRING'
and id = 0;
CARDINALITY BYTES TIME
12934699 5717136958 19
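To turn those three numbers into something a user can act on, a little arithmetic helps. The following Python sketch is an illustration only: it takes the CARDINALITY and BYTES values from the PLAN_TABLE row above and derives an average row size, a total size, and a rough transfer time. The throughput figure is a made-up assumption, not something Oracle reports, and the inputs are optimizer estimates, so the outputs are no more reliable than the plan itself.

```python
# Optimizer estimates copied from the PLAN_TABLE query above.
CARDINALITY = 12_934_699     # estimated number of rows
TOTAL_BYTES = 5_717_136_958  # estimated result size in bytes

# Hypothetical sustained client throughput (bytes per second); measure your own.
ASSUMED_THROUGHPUT = 50 * 10**6


def summarize_estimate(cardinality, total_bytes, throughput_bytes_per_sec):
    """Derive per-row size, total size in MiB, and a naive transfer time."""
    return {
        "avg_row_bytes": total_bytes / cardinality,
        "size_mib": total_bytes / 2**20,
        "transfer_seconds": total_bytes / throughput_bytes_per_sec,
    }


summary = summarize_estimate(CARDINALITY, TOTAL_BYTES, ASSUMED_THROUGHPUT)
print(f"average row size: {summary['avg_row_bytes']:.0f} bytes")
print(f"estimated size:   {summary['size_mib']:.0f} MiB")
print(f"naive transfer:   {summary['transfer_seconds']:.0f} s at the assumed throughput")
```

The size in MiB lines up with the 5452M shown in the Bytes column of the plan, which suggests that column uses binary megabytes. The transfer time ignores fetch round trips and client-side processing, so treat it as a lower bound.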
Adjust sys.aux_stats$: The time estimate is based on system statistics stored in sys.aux_stats$. These are numbers for metrics like CPU speed, single-block I/O read time, etc. For example, on my system:
select * from sys.aux_stats$ order by sname
SNAME PNAME PVAL1 PVAL2
SYSSTATS_INFO DSTART 09-11-2014 11:18
SYSSTATS_INFO DSTOP 09-11-2014 11:18
SYSSTATS_INFO FLAGS 1
SYSSTATS_INFO STATUS COMPLETED
SYSSTATS_MAIN CPUSPEED
SYSSTATS_MAIN CPUSPEEDNW 3201.10192837466
SYSSTATS_MAIN IOSEEKTIM 10
SYSSTATS_MAIN IOTFRSPEED 4096
SYSSTATS_MAIN MAXTHR
SYSSTATS_MAIN MBRC
SYSSTATS_MAIN MREADTIM
SYSSTATS_MAIN SLAVETHR
SYSSTATS_MAIN SREADTIM
The numbers can be automatically gathered by dbms_stats.gather_system_stats. They can also be manually modified. It's a SYS table but relatively safe to modify. Create some sample queries, compare the estimated time with the actual time, and adjust the numbers until they match.
Discover you probably wasted a lot of time
Predicting run time is theoretically impossible to get right in all cases, and in practice it is horribly difficult to forecast for non-trivial queries. Jonathan Lewis wrote a whole book about those predictions, and that book only covers the "basics".
Complex explain plans are typically "good enough" if the estimates are off by one or two orders of magnitude. But that kind of difference is typically not good enough to show to a user, or use for making any important decisions.

Vertica and joins

I'm adapting a web analysis tool to use Vertica as the DB. I'm having real problems optimizing joins. I tried creating pre-join projections for some of my queries, and while it did make the queries blazing fast, it slowed data loading into the fact table to a crawl.
A simple INSERT INTO ... SELECT * FROM which we use to load data into the fact table from a staging table goes from taking ~5 seconds to taking 20+ minutes.
Because of this I dropped all pre-join projections and tried using the Database Designer to design query specific projections but it's not enough. Even with those projections a simple join is taking ~14 seconds, something that takes ~1 second with a pre-join projection.
My question is this: Is it normal for a pre-join projection to slow data insertion this much and if not, what could be the culprit? If it is normal, then it's a show stopper for us and are there other techniques we could use to speed up the joins?
We're running Vertica on a 5 node cluster, each node having 2 x quad core CPU and 32 GB of memory. The tables in my example query have 188,843,085 and 25,712,878 rows respectively.
The EXPLAIN output looks like this:
EXPLAIN SELECT referer_via_.url as referralPageUrl, COUNT(DISTINCT session.id) as visits
FROM owa_session as session
JOIN owa_referer AS referer_via_ ON session.referer_id = referer_via_.id
WHERE session.yyyymmdd BETWEEN '20121123' AND '20121123' AND session.site_id = '49'
GROUP BY referer_via_.url
ORDER BY visits DESC LIMIT 250;
Access Path:
+-SELECT LIMIT 250 [Cost: 1M, Rows: 250 (STALE STATISTICS)] (PATH ID: 0)
| Output Only: 250 tuples
| Execute on: Query Initiator
| +---> SORT [Cost: 1M, Rows: 1 (STALE STATISTICS)] (PATH ID: 1)
| | Order: count(DISTINCT "session".id) DESC
| | Output Only: 250 tuples
| | Execute on: All Nodes
| | +---> GROUPBY PIPELINED (RESEGMENT GROUPS) [Cost: 1M, Rows: 1 (STALE STATISTICS)] (PATH ID: 2)
| | | Aggregates: count(DISTINCT "session".id)
| | | Group By: referer_via_.url
| | | Execute on: All Nodes
| | | +---> GROUPBY HASH (SORT OUTPUT) (RESEGMENT GROUPS) [Cost: 1M, Rows: 1 (STALE STATISTICS)] (PATH ID: 3)
| | | | Group By: referer_via_.url, "session".id
| | | | Execute on: All Nodes
| | | | +---> JOIN HASH [Cost: 1M, Rows: 1 (STALE STATISTICS)] (PATH ID: 4) Outer (RESEGMENT)
| | | | | Join Cond: ("session".referer_id = referer_via_.id)
| | | | | Execute on: All Nodes
| | | | | +-- Outer -> STORAGE ACCESS for session [Cost: 463, Rows: 1 (STALE STATISTICS)] (PUSHED GROUPING) (PATH ID: 5)
| | | | | | Projection: public.owa_session_projection
| | | | | | Materialize: "session".id, "session".referer_id
| | | | | | Filter: ("session".site_id = '49')
| | | | | | Filter: (("session".yyyymmdd >= 20121123) AND ("session".yyyymmdd <= 20121123))
| | | | | | Execute on: All Nodes
| | | | | +-- Inner -> STORAGE ACCESS for referer_via_ [Cost: 293K, Rows: 26M] (PATH ID: 6)
| | | | | | Projection: public.owa_referer_DBD_1_seg_Potency_20121122_Potency_20121122
| | | | | | Materialize: referer_via_.id, referer_via_.url
| | | | | | Execute on: All Nodes
To speed up the join:
Partition the session table on column "yyyymmdd". This will enable partition pruning.
Add a condition on column "yyyymmdd" to referer_via_ and partition on it, if that is possible (most likely not).
Put column site_id as close as possible to the beginning of the ORDER BY list in the (super)projection used for session.
Segment the two tables on referer_id and id respectively.
Having more nodes in the cluster also helps.
My question is this: Is it normal for a pre-join projection to slow data insertion this much and if not, what could be the culprit? If it is normal, then it's a show stopper for us and are there other techniques we could use to speed up the joins?
I guess the amount affected would vary depending on data sets and structures you are working with. But, since this is the variable you changed, I believe it is safe to say the pre-join projection is causing the slowness. You are gaining query time at the expense of insertion time.
Someone please correct me if any of the following is wrong. I'm going by memory and by information picked up with conversations with others.
You can speed up your joins without a pre-join projection in a few ways. I believe it would help to segment the projections for both tables on the join predicate, in this case the referer ID. Anything you can do to filter the data helps as well.
Looking at your explain plan, you are doing a hash join instead of a merge join, which you probably want to look at.
Lastly, I would like to know via the explain plan or through system tables if your query is actually using the projections Database Designer has recommended. If not, explicitly specify them in your query and see if that helps.
You seem to have a lot of STALE STATISTICS.
Responding to STALE statistics is important, because that is the reason why your queries are slow. Without statistics about the underlying data, Vertica's query optimizer cannot choose the best execution plan. Note that responding to STALE statistics only improves SELECT performance, not update performance.
If you update your tables regularly, remember that there are additional things you have to consider in Vertica. Please check the answer that I posted to this question.
I hope that should help improve your update speed.
Explore the AHM settings as explained in that answer. If you don't need to be able to select deleted rows in a table later, it is often a good idea to not keep them around. There are ways to keep only the latest epoch version of the data. Or manually purge deleted data.
Let me know how it goes.
I think your query could be more explicit. Also, don't use that devil BETWEEN. Try this:
EXPLAIN SELECT
referer_via_.url as referralPageUrl,
COUNT(DISTINCT session.id) as visits
FROM owa_session as session
JOIN owa_referer AS referer_via_
ON session.referer_id = referer_via_.id
WHERE session.yyyymmdd >= '20121123'
AND session.yyyymmdd <= '20121123'
AND session.site_id = '49'
GROUP BY referer_via_.url
-- this `visits` column needs a table name
ORDER BY visits DESC LIMIT 250;
I'll say I'm really perplexed as to why you would use the same DATE on both sides of BETWEEN; you may want to look into that.
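When replacing BETWEEN with explicit comparisons, the bounds need care: BETWEEN is inclusive at both ends, which matches the `>=` / `<=` filter Vertica printed in the EXPLAIN output above. The sketch below checks this with Python's built-in sqlite3 module and a hypothetical four-row owa_session table. It runs on SQLite, not Vertica, so it only illustrates the standard SQL semantics under the assumption that Vertica follows them.

```python
import sqlite3

# In-memory stand-in for the real table; the rows are made up for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE owa_session (id INTEGER, yyyymmdd TEXT)")
conn.executemany(
    "INSERT INTO owa_session VALUES (?, ?)",
    [(1, "20121122"), (2, "20121123"), (3, "20121123"), (4, "20121124")],
)


def count_where(predicate):
    """Count rows matching a WHERE predicate (trusted, hard-coded strings only)."""
    sql = f"SELECT COUNT(*) FROM owa_session WHERE {predicate}"
    return conn.execute(sql).fetchone()[0]


between_count = count_where("yyyymmdd BETWEEN '20121123' AND '20121123'")
inclusive_count = count_where("yyyymmdd >= '20121123' AND yyyymmdd <= '20121123'")
equals_count = count_where("yyyymmdd = '20121123'")
# Making either bound exclusive turns a single-day range into an empty one.
exclusive_count = count_where("yyyymmdd > '20121123' AND yyyymmdd <= '20121123'")

print(between_count, inclusive_count, equals_count, exclusive_count)
```

The first three predicates select the same rows, so for a single day a plain equality is the simplest form; the last one can never match anything.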
This is my view, coming from an academic background working with column databases, including Vertica (recent PhD graduate in database systems).
My question is this: Is it normal for a pre-join projection to slow data insertion this much and if not, what could be the culprit? If it is normal, then it's a show stopper for us and are there other techniques we could use to speed up the joins?
Yes, updating projections is very slow and you should ideally do it only in large batches to amortize the update cost. The fundamental reason is that each projection represents another copy of the data (of each table column that is part of the projection).
A single row insert requires adding one value (one attribute) to each column in the projection. For example, a single row insert in a table with 20 attributes requires at least 20 column updates. To make things worse, each column is sorted and compressed. This means that inserting the new value in a column requires multiple operations on large chunks of data: read data / decompress / update / sort / compress data / write data back. Vertica has several optimizations for updates but cannot completely hide the cost.
Projections can be thought of as the equivalent of multi-column indexes in a traditional row store (MySQL, PostgreSQL, Oracle, etc.). The upside of projections versus traditional B-Tree indexes is that reading them (using them to answer a query) is much faster than using traditional indexes. The reasons are multiple: no need to access head data as for non-clustered indexes, smaller size due to compression, etc. The flipside is that they are way more difficult to update. Tradeoffs...

Oracle <> , != , ^= operators

I want to know the difference of those operators, mainly their performance difference.
I have had a look at Difference between <> and != in SQL, it has no performance related information.
Then I found this on dba-oracle.com,
it suggests that in 10.2 onwards the performance can be quite different.
I wonder why. Does != always perform better than <>?
NOTE: Our tests, and performance on the live system, show that changing from <> to != has a big impact on the time the queries return in. I am here to ask WHY this is happening, not whether they are the same or not. I know semantically they are, but in reality they are different.
I have tested the performance of the different syntax for the not equal operator in Oracle. I have tried to eliminate all outside influence to the test.
I am using an 11.2.0.3 database. No other sessions are connected and the database was restarted before commencing the tests.
A schema was created with a single table and a sequence for the primary key
CREATE TABLE loadtest.load_test (
id NUMBER NOT NULL,
a VARCHAR2(1) NOT NULL,
n NUMBER(2) NOT NULL,
t TIMESTAMP NOT NULL
);
CREATE SEQUENCE loadtest.load_test_seq
START WITH 0
MINVALUE 0;
The table was indexed to improve the performance of the query.
ALTER TABLE loadtest.load_test
ADD CONSTRAINT pk_load_test
PRIMARY KEY (id)
USING INDEX;
CREATE INDEX loadtest.load_test_i1
ON loadtest.load_test (a, n);
Ten million rows were added to the table using the sequence, SYSDATE for the timestamp and random data via DBMS_RANDOM (A-Z) and (0-99) for the other two fields.
SELECT COUNT(*) FROM load_test;
COUNT(*)
----------
10000000
1 row selected.
The schema was analysed to provide good statistics.
EXEC DBMS_STATS.GATHER_SCHEMA_STATS(ownname => 'LOADTEST', estimate_percent => NULL, cascade => TRUE);
The three simple queries are:-
SELECT a, COUNT(*) FROM load_test WHERE n <> 5 GROUP BY a ORDER BY a;
SELECT a, COUNT(*) FROM load_test WHERE n != 5 GROUP BY a ORDER BY a;
SELECT a, COUNT(*) FROM load_test WHERE n ^= 5 GROUP BY a ORDER BY a;
These are exactly the same with the exception of the syntax for the not equals operator (not just <> and != but also ^= )
First each query is run without collecting the result in order to eliminate the effect of caching.
Next timing and autotrace were switched on to gather both the actual run time of the query and the execution plan.
SET TIMING ON
SET AUTOTRACE TRACE
Now the queries are run in turn. First up is <>
> SELECT a, COUNT(*) FROM load_test WHERE n <> 5 GROUP BY a ORDER BY a;
26 rows selected.
Elapsed: 00:00:02.12
Execution Plan
----------------------------------------------------------
Plan hash value: 2978325580
--------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 26 | 130 | 6626 (9)| 00:01:20 |
| 1 | SORT GROUP BY | | 26 | 130 | 6626 (9)| 00:01:20 |
|* 2 | INDEX FAST FULL SCAN| LOAD_TEST_I1 | 9898K| 47M| 6132 (2)| 00:01:14 |
--------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("N"<>5)
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
22376 consistent gets
22353 physical reads
0 redo size
751 bytes sent via SQL*Net to client
459 bytes received via SQL*Net from client
3 SQL*Net roundtrips to/from client
1 sorts (memory)
0 sorts (disk)
26 rows processed
Next !=
> SELECT a, COUNT(*) FROM load_test WHERE n != 5 GROUP BY a ORDER BY a;
26 rows selected.
Elapsed: 00:00:02.13
Execution Plan
----------------------------------------------------------
Plan hash value: 2978325580
--------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 26 | 130 | 6626 (9)| 00:01:20 |
| 1 | SORT GROUP BY | | 26 | 130 | 6626 (9)| 00:01:20 |
|* 2 | INDEX FAST FULL SCAN| LOAD_TEST_I1 | 9898K| 47M| 6132 (2)| 00:01:14 |
--------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("N"<>5)
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
22376 consistent gets
22353 physical reads
0 redo size
751 bytes sent via SQL*Net to client
459 bytes received via SQL*Net from client
3 SQL*Net roundtrips to/from client
1 sorts (memory)
0 sorts (disk)
26 rows processed
Lastly ^=
> SELECT a, COUNT(*) FROM load_test WHERE n ^= 5 GROUP BY a ORDER BY a;
26 rows selected.
Elapsed: 00:00:02.10
Execution Plan
----------------------------------------------------------
Plan hash value: 2978325580
--------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 26 | 130 | 6626 (9)| 00:01:20 |
| 1 | SORT GROUP BY | | 26 | 130 | 6626 (9)| 00:01:20 |
|* 2 | INDEX FAST FULL SCAN| LOAD_TEST_I1 | 9898K| 47M| 6132 (2)| 00:01:14 |
--------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("N"<>5)
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
22376 consistent gets
22353 physical reads
0 redo size
751 bytes sent via SQL*Net to client
459 bytes received via SQL*Net from client
3 SQL*Net roundtrips to/from client
1 sorts (memory)
0 sorts (disk)
26 rows processed
The execution plan for the three queries is identical and the timings are 2.12, 2.13 and 2.10 seconds.
It should be noted that whichever syntax is used in the query the execution plan always displays <>
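Comparing the autotrace "Statistics" blocks by eye works for three runs but gets tedious with more. As a rough sketch (the helper names are hypothetical and not part of the test above), the blocks can be parsed into dictionaries in Python and diffed, so that any change in consistent gets or physical reads between two variants stands out immediately.

```python
import re

# Statistics block copied from the autotrace output above (identical for all three runs).
AUTOTRACE_STATS = """\
0 recursive calls
0 db block gets
22376 consistent gets
22353 physical reads
0 redo size
751 bytes sent via SQL*Net to client
459 bytes received via SQL*Net from client
3 SQL*Net roundtrips to/from client
1 sorts (memory)
0 sorts (disk)
26 rows processed
"""


def parse_autotrace_stats(text):
    """Map each statistic name to its integer value."""
    stats = {}
    for line in text.splitlines():
        match = re.match(r"\s*(\d+)\s+(.+?)\s*$", line)
        if match:
            stats[match.group(2)] = int(match.group(1))
    return stats


def diff_stats(first, second):
    """Return {name: (first_value, second_value)} for statistics that differ."""
    names = sorted(set(first) | set(second))
    return {
        name: (first.get(name), second.get(name))
        for name in names
        if first.get(name) != second.get(name)
    }


stats_ansi = parse_autotrace_stats(AUTOTRACE_STATS)  # run with <>
stats_bang = parse_autotrace_stats(AUTOTRACE_STATS)  # run with !=
print(diff_stats(stats_ansi, stats_bang) or "no differences")
```

An empty diff, as here, means both variants did the same amount of logical and physical I/O.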
The tests were repeated ten times for each operator syntax. These are the timings:-
<>
2.09
2.13
2.12
2.10
2.07
2.09
2.10
2.13
2.13
2.10
!=
2.09
2.10
2.12
2.10
2.15
2.10
2.12
2.10
2.10
2.12
^=
2.09
2.16
2.10
2.09
2.07
2.16
2.12
2.12
2.09
2.07
Whilst there is some variance of a few hundredths of a second, it is not significant. The results for each of the three syntax choices are the same.
The syntax choices are parsed, optimised and are returned with the same effort in the same time. There is therefore no perceivable benefit from using one over another in this test.
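To put a number on "not significant", the ten timings per operator listed above can be summarised with a mean and a sample standard deviation. This is a small Python sketch using only the figures already reported; comparing the spread of the means against the run-to-run noise is an informal check, not a formal significance test.

```python
from statistics import mean, stdev

# Elapsed times in seconds, copied from the ten runs listed above.
timings = {
    "<>": [2.09, 2.13, 2.12, 2.10, 2.07, 2.09, 2.10, 2.13, 2.13, 2.10],
    "!=": [2.09, 2.10, 2.12, 2.10, 2.15, 2.10, 2.12, 2.10, 2.10, 2.12],
    "^=": [2.09, 2.16, 2.10, 2.09, 2.07, 2.16, 2.12, 2.12, 2.09, 2.07],
}

# Mean and sample standard deviation per operator.
summary = {op: (mean(runs), stdev(runs)) for op, runs in timings.items()}
for op, (avg, sd) in summary.items():
    print(f"{op}: mean={avg:.3f}s stdev={sd:.3f}s")

# How far apart the three means are, versus the smallest run-to-run noise.
means = [avg for avg, _ in summary.values()]
spread_of_means = max(means) - min(means)
smallest_stdev = min(sd for _, sd in summary.values())
print(f"spread of means={spread_of_means:.3f}s, smallest stdev={smallest_stdev:.3f}s")
```

The means differ by a few thousandths of a second, which is well inside the variation seen between repeated runs of the same statement.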
"Ah BC", you say, "in my tests I believe there is a real difference and you can not prove it otherwise".
Yes, I say, that is perfectly true. You have not shown your tests, query, data or results. So I have nothing to say about your results. I have shown that, with all other things being equal, it doesn't matter which syntax you use.
"So why do I see that one is better in my tests?"
Good question. There are several possibilities:-
Your testing is flawed (you did not eliminate outside factors such as other workload, caching, etc. You have given no information from which we can make an informed decision).
Your query is a special case (show me the query and we can discuss it).
Your data is a special case (Perhaps - but how - we don't see that either).
There is some other outside influence.
I have shown via a documented and repeatable process that there is no benefit to using one syntax over another. I believe that <> != and ^= are synonymous.
If you believe otherwise fine, so
a) show a documented example that I can try myself
and
b) use the syntax which you think is best. If I am correct and there is no difference it won't matter. If you are correct then cool, you have an improvement for very little work.
"But Burleson said it was better and I trust him more than you, Faroult, Lewis, Kyte and all those other bums."
Did he say it was better? I don't think so. He didn't provide any definitive example, test or result but only linked to someone saying that != was better and then quoted some of their post.
Show don't tell.
You reference the article on the Burleson site. Did you follow the link to the Oracle-L archive? And did you read the other emails replying to the email Burleson cites?
I don't think you did, otherwise you wouldn't have asked this question. Because there is no fundamental difference between != and <>. The original observation was almost certainly a fluke brought about by ambient conditions in the database. Read the responses from Jonathan Lewis and Stephane Faroult to understand more.
" Respect is not something a programmer need to have, its the basic
attitude any human being should have"
Up to a point. When we meet a stranger in the street then of course we should be courteous and treat them with respect.
But if that stranger wants me to design my database application in a specific way to "improve performance" then they should have a convincing explanation and some bulletproof test cases to back it up. An isolated anecdote from some random individual is not enough.
The writer of the article, although a book author and the purveyor of some useful information, does not have a good reputation for accuracy. In this case the article was merely a mention of one person's observations on a well-known Oracle mailing list. If you read through the responses you will see the assumptions of the post challenged, but no presumption of accuracy. Here are some excerpts:
Try running your query through explain plan (or autotrace) and see
what that says...
According to this, "!=" is considered to be the same as "<>"...
Jonathan Lewis
Jonathan Lewis is a well respected expert in the Oracle community.
Just out of curiosity... Does the query optimizer generate a different
execution plan for the two queries? Regards, Chris
.
Might it be bind variable peeking in action? The certain effect of
writing != instead of <> is to force a re-parse. If at the first
execution the values for :id were different and if you have an
histogram on claws_doc_id it could be a reason. And if you tell me
that claws_doc_id is the primary key, then I'll ask you what is the
purpose of counting, in particular when the query in the EXISTS clause
is uncorrelated with the outer query and will return the same result
whatever :id is. Looks like a polling query. The code surrounding it
must be interesting.
Stéphane Faroult
.
I'm pretty sure the lexical parse converts either != to <> or <> to
!=, but I'm not sure whether that affects whether the sql text will
match a stored outline.
.
Do the explain plans look the same? Same costs?
The following response is from the original poster.
Jonathan, Thank you for your answer. We did do an explain plan on
both versions of the statement and they were identical, which is what
is so puzzling about this. According to the documentation, the two
forms of not equal are the same (along with ^= and one other that I
can't type), so it makes no sense to me why there is any difference in
performance.
Scott Canaan
.
Not an all inclusive little test but it appears at least in 10.1.0.2
it gets pared into a "<>" for either (notice the filter line for each
plan)
.
Do you have any Stored Outline ? Stored Outlines do exact (literal)
matches so if you have one Stored Outline for, say, the SQL with a
"!=" and none for the SQL with a "<>" (or a vice versa), the Stored
Outline might be using hints ? (although, come to think of it, your
EXPLAIN PLAN should have shown the hints if executing a Stored Outline
?)
.
Have you tried going beyond just explain & autotrace and running a
full 10046 level 12 trace to see where the slower version is spending
its time? This might shed some light on the subject, plus - be sure
to verify that the explain plans are exactly the same in the 10046
trace file (not the ones generated with the EXPLAIN= option), and in
v$sqlplan. There are some "features" of autotrace and explain that
can cause it to not give you an accurate explain plan.
Regards, Brandon
.
Is the phenomenon totally reproducible ?
Did you check the filter_predicates and access_predicates of the plan,
or just the structure. I don't expect any difference, but a change in
predicate order can result in a significant change in CPU usage if you
are unlucky.
If there is no difference there, then enable rowsource statistics
(alter session set "_rowsource_execution_statistics"=true) and run the
queries, then grab the execution plan from V$sql_plan and join to
v$sql_plan_statistics to see if any of the figures about last_starts,
last_XXX_buffer_gets, last_disk_reads, last_elapsed_time give you a
clue about where the time went.
If you are on 10gR2 there is a /*+ gather_plan_statistics */ hint you
can use instead of the "alter session".
Regards Jonathan Lewis
At this point the thread dies and we see no further posts from the original poster, which leads me to believe that either the OP discovered an assumption they had made that was not true or did no further investigation.
I will also point out that if you do an explain plan or autotrace, you will see that the comparison is always displayed as <>.
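A minimal sketch of that check, assuming a throwaway table (the name neq_demo is hypothetical and not from the thread). Running each statement through EXPLAIN PLAN and reading the "Predicate Information" section of the DBMS_XPLAN.DISPLAY output lets you see how the predicate was normalized for each spelling of the operator:

```sql
-- Hypothetical demo table; neq_demo is an illustrative name only.
CREATE TABLE neq_demo AS
  SELECT level AS c1 FROM dual CONNECT BY level <= 1000;

-- Variant written with !=
EXPLAIN PLAN FOR
  SELECT count(*) FROM neq_demo WHERE c1 != 0;
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);

-- Variant written with ^=
EXPLAIN PLAN FOR
  SELECT count(*) FROM neq_demo WHERE c1 ^= 0;
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);

-- Clean up the demo table.
DROP TABLE neq_demo;
```

Compare the filter line in the two outputs; if the operators are treated identically, both plans should report the same predicate text.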
Here is some test code. Increase the number of loop iterations if you like. You may see one side or the other get a higher number depending on other activity on the server, but in no way will you see one operator come out consistently better than the other.
DROP TABLE t1;
DROP TABLE t2;
CREATE TABLE t1 AS (SELECT level c1 FROM dual CONNECT BY level <= 144000);
CREATE TABLE t2 AS (SELECT level c1 FROM dual CONNECT BY level <= 144000);
SET SERVEROUTPUT ON FORMAT WRAPPED
DECLARE
  vStart  Date;
  vTotalA Number(10) := 0;  -- seconds spent on the <> variant
  vTotalB Number(10) := 0;  -- seconds spent on the != variant
  vResult Number(10);
BEGIN
  For vLoop In 1..10 Loop
    -- Time 2000 executions of the query written with <>
    vStart := sysdate;
    For vLoop2 In 1..2000 Loop
      SELECT count(*) INTO vResult FROM t1 WHERE t1.c1 = 777 AND EXISTS
        (SELECT 1 FROM t2 WHERE t2.c1 <> 0);
    End Loop;
    vTotalA := vTotalA + ((sysdate - vStart)*24*60*60);
    -- Time 2000 executions of the same query written with !=
    vStart := sysdate;
    For vLoop2 In 1..2000 Loop
      SELECT count(*) INTO vResult FROM t1 WHERE t1.c1 = 777 AND EXISTS
        (SELECT 1 FROM t2 WHERE t2.c1 != 0);
    End Loop;
    vTotalB := vTotalB + ((sysdate - vStart)*24*60*60);
    DBMS_Output.Put_Line('Total <>: ' || RPAD(vTotalA,8) || '!=: ' || vTotalB);
    -- Reset the totals so each outer iteration reports its own timings
    vTotalA := 0;
    vTotalB := 0;
  End Loop;
END;
/
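Wall-clock timing with sysdate only resolves to whole seconds. A rough sketch of the row-source statistics approach Jonathan Lewis describes in the thread above, assuming 10gR2 or later and the t1/t2 tables created by the test code. Run it once per operator and compare the per-step figures (starts, buffer gets, elapsed time):

```sql
-- In SQL*Plus, DISPLAY_CURSOR with a NULL sql_id reports the last statement
-- the session executed; with SERVEROUTPUT ON that can be an internal
-- DBMS_OUTPUT call rather than your query, so switch it off first.
SET SERVEROUTPUT OFF

SELECT /*+ gather_plan_statistics */ count(*)
FROM t1
WHERE t1.c1 = 777
AND EXISTS (SELECT 1 FROM t2 WHERE t2.c1 != 0);

-- Show the actual plan with the statistics from the last execution.
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR(NULL, NULL, 'ALLSTATS LAST'));
```

Repeat with <> in place of != and check whether any plan step shows a meaningful difference.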
A Programmer will use !=
A DBA will use <>
If there is a different execution plan it may be that there are differences in the query cache or statistics for each notation. But I don't really think it is so.
Edit:
What I mean by the above: in complex databases there can be some strange side effects. I don't know Oracle well enough, but I think it has a query compilation cache like the one in SQL Server 2008 R2.
If a query is compiled as a new query, the database optimizer calculates a new execution plan based on the current statistics. If the statistics have changed, the result will be a different, and possibly worse, plan.

Optimal MySQL temporary tables (memory tables) configuration?

First of all, I am new to optimizing MySQL. My web application (around 400 queries per second) has a query that uses a GROUP BY I can't avoid, and that GROUP BY causes temporary tables to be created. My configuration was:
max_heap_table_size = 16M
tmp_table_size = 32M
The result: roughly 12.5% of temp tables went to disk.
Then I changed my settings, according to this post
max_heap_table_size = 128M
tmp_table_size = 128M
The result: roughly 18% of temp tables went to disk.
The results were not what I expected, and I do not understand why.
Is it wrong to set tmp_table_size = max_heap_table_size?
Should I not increase the size?
Query
SELECT images, id
FROM classifieds_ads
WHERE parent_category = '1' AND published='1' AND outdated='0'
GROUP BY aux_order
ORDER BY date_lastmodified DESC
LIMIT 0, 100;
EXPLAIN
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
| 1 | SIMPLE | classifieds_ads | ref | parent_category, published, combined_parent_oudated_published, oudated | combined_parent_oudated_published | 7 | const,const,const | 67552 | Using where; Using temporary; Using filesort |
"Using temporary" in the EXPLAIN report does not tell us that the temp table was on disk. It only tells us that the query expects to create a temp table.
The temp table will stay in memory if its size is less than tmp_table_size and less than max_heap_table_size.
Max_heap_table_size is the largest a table can be in the MEMORY storage engine, whether that table is a temp table or non-temp table.
Tmp_table_size is the largest a table can be in memory when it is created automatically by a query. But this can't be larger than max_heap_table_size anyway. So there's no benefit to setting tmp_table_size greater than max_heap_table_size. It's common to set these two config variables to the same value.
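As a sketch of how the two settings can be inspected and kept equal at runtime (the 64 MB figure is an arbitrary illustration, not a recommendation for this workload):

```sql
-- Inspect the current server-wide values (reported in bytes).
SHOW GLOBAL VARIABLES LIKE 'tmp_table_size';
SHOW GLOBAL VARIABLES LIKE 'max_heap_table_size';

-- Set both limits to the same value; 64 MB is an assumed example value.
SET GLOBAL tmp_table_size      = 64 * 1024 * 1024;
SET GLOBAL max_heap_table_size = 64 * 1024 * 1024;
```

SET GLOBAL needs the appropriate privilege, only affects connections opened afterwards, and is lost on restart unless the same values are also put in the server configuration file.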
You can monitor how many temp tables were created, and how many of those went to disk, like this:
mysql> show global status like 'Created%';
+-------------------------+-------+
| Variable_name | Value |
+-------------------------+-------+
| Created_tmp_disk_tables | 20 |
| Created_tmp_files | 6 |
| Created_tmp_tables | 43 |
+-------------------------+-------+
Note in this example, 43 temp tables were created, but only 20 of those were on disk.
When you increase the limits of tmp_table_size and max_heap_table_size, you allow larger temp tables to exist in memory.
You may ask, how large do you need to make it? You don't necessarily need to make it large enough for every single temp table to fit in memory. You might want 95% of your temp tables to fit in memory and only the remaining rare tables go on disk. Those last 5% might be very large -- a lot larger than the amount of memory you want to use for that.
So my practice is to increase tmp_table_size and max_heap_table_size conservatively. Then watch the ratio of Created_tmp_disk_tables to Created_tmp_tables to see if I have met my goal of making 95% of them stay in memory (or whatever ratio I want to see).
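A minimal sketch of computing that ratio in one query, assuming MySQL 5.7 or later, where the status counters are exposed in performance_schema.global_status (older versions expose them in information_schema.GLOBAL_STATUS instead):

```sql
SELECT
  d.VARIABLE_VALUE AS disk_tmp_tables,
  t.VARIABLE_VALUE AS all_tmp_tables,
  -- NULLIF avoids dividing by zero on a freshly started server.
  ROUND(100 * d.VARIABLE_VALUE / NULLIF(t.VARIABLE_VALUE, 0), 1) AS pct_on_disk
FROM performance_schema.global_status AS d
CROSS JOIN performance_schema.global_status AS t
WHERE d.VARIABLE_NAME = 'Created_tmp_disk_tables'
  AND t.VARIABLE_NAME = 'Created_tmp_tables';
```

With the counters from the example output above (20 on disk out of 43), the ratio works out to about 46.5%. The counters are cumulative since server start, so to judge the effect of a settings change, compare the deltas between two samples taken after the change.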
Unfortunately, MySQL doesn't have a good way to tell you exactly how large the temp tables were. That will vary per query, so the status variables can't show that, they can only show you a count of how many times it has occurred. And EXPLAIN doesn't actually execute the query so it can't predict exactly how much data it will match.
An alternative is Percona Server, which is a distribution of MySQL with improvements. One of these is to log extra information in the slow-query log. Included in the extra fields is the size of any temp tables created by a given query.