I'm new to PL/SQL. Can you please let me know how I can simplify the IF statement below?
IF (inSeries='90') OR (inSeries='91') OR (inSeries='92') OR (inSeries='93') OR (inSeries='94') THEN
In SQL we can write
WHERE inSeries IN ('90','91','92','93','94')
The IN condition also works in a PL/SQL IF statement:
declare
inSeries varchar2(2) := '90';
begin
if inseries in ('90','91','92','93','94')
then
dbms_output.put_line(inseries ||':this is within series');
else
dbms_output.put_line(inseries ||':this is out of series');
end if;
end;
-- output (the second line comes from a run with inSeries := '80')
90:this is within series
80:this is out of series
There is another way, depending on the business logic. Since the values in your question form a consecutive series, you can directly use a combination of greater-than and less-than comparisons...
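A minimal sketch of that range check, assuming inSeries always holds a numeric string. TO_NUMBER is my addition so that the comparison is numeric rather than character-based; it raises an error if the value is not numeric.

```sql
declare
  inSeries varchar2(2) := '92';
begin
  -- numeric range check; BETWEEN includes both 90 and 94
  if to_number(inSeries) between 90 and 94
  then
    dbms_output.put_line(inSeries || ':this is within series');
  else
    dbms_output.put_line(inSeries || ':this is out of series');
  end if;
end;
/
```

This only replaces the IN list when every value in the range is valid; if the list ever gets gaps, IN stays the safer choice.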
The optimizer will most probably rewrite the query so that your IN becomes OR anyway. Compare line 3 of the query with the very last line of the output:
SQL> select job
2 from emp
3 where deptno in (10, 20, 30, 40);
JOB
---------
CLERK
SALESMAN
<snip>
14 rows selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 3956160932
--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 14 | 154 | 3 (0)| 00:00:01 |
|* 1 | TABLE ACCESS FULL| EMP | 14 | 154 | 3 (0)| 00:00:01 |
--------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("DEPTNO"=10 OR "DEPTNO"=20 OR "DEPTNO"=30 OR "DEPTNO"=40)
SQL>
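You can also reuse the SQL condition together with the EXISTS keyword. Note that PL/SQL does not accept a subquery inside an IF condition, so the statement below is only valid in dialects such as T-SQL:
IF EXISTS (SELECT * FROM <table_name> WHERE inSeries IN ('90','91','92','93','94'))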
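In Oracle PL/SQL an IF condition cannot contain a subquery, so one way to apply the same idea is to run the query first and test its result. A rough sketch, using DUAL instead of a real table and a hypothetical variable l_cnt:

```sql
declare
  inSeries varchar2(2) := '90';
  l_cnt    pls_integer;  -- hypothetical helper variable
begin
  -- the IN condition is evaluated by the SQL engine; DUAL yields 0 or 1 rows
  select count(*)
  into   l_cnt
  from   dual
  where  inSeries in ('90','91','92','93','94');

  if l_cnt > 0
  then
    dbms_output.put_line(inSeries || ':this is within series');
  else
    dbms_output.put_line(inSeries || ':this is out of series');
  end if;
end;
/
```

This costs an extra switch to the SQL engine, so for a fixed list of literals the plain IF ... IN form is simpler.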
I am wondering if there is a difference between the two queries below.
I am looking for a general answer to explain how the optimizer treats each of these answers. There is an index on t.id.
The version of Oracle is 11g.
select t.id, sum(t.amount)
from transaction t
group by t.id
having sum(t.amount) between -0.009 and 0.009
select t.id, sum(t.amount)
from transaction t
group by t.id
having sum(t.amount) >= -0.009 and sum(t.amount)<= 0.009
In an aggregation query, most of the work involves moving the data around. There is some overhead for aggregations, but it is usually pretty simple.
And, the SQL compiler can decide if it wants to re-use aggregated expressions. Just because you use sum(amount) twice in the query doesn't mean that it gets executed twice.
Some aggregation functions are more expensive -- especially on strings or using distinct. You can always test queries to see if there is much impact, but in general, you should worry about whether your logic is correct not how many times you are using aggregation functions.
If you want to observe basic information about the steps chosen by the CBO for the execution of a SQL statement, use EXPLAIN PLAN.
Example
EXPLAIN PLAN SET STATEMENT_ID = 'jara1' into plan_table FOR
select DEPARTMENT_ID, sum(salary)
from HR.employees
group by DEPARTMENT_ID
having sum(salary) between 5000 and 10000
;
--
SELECT * FROM table(DBMS_XPLAN.DISPLAY('plan_table', 'jara1','ALL'));
The query returns
Plan hash value: 244580604
---------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 7 | 4 (25)| 00:00:01 |
|* 1 | FILTER | | | | | |
| 2 | HASH GROUP BY | | 1 | 7 | 4 (25)| 00:00:01 |
| 3 | TABLE ACCESS FULL| EMPLOYEES | 107 | 749 | 3 (0)| 00:00:01 |
---------------------------------------------------------------------------------
Query Block Name / Object Alias (identified by operation id):
-------------------------------------------------------------
1 - SEL$1
3 - SEL$1 / EMPLOYEES#SEL$1
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter(SUM("SALARY")>=5000 AND SUM("SALARY")<=10000)
Column Projection Information (identified by operation id):
-----------------------------------------------------------
1 - (rowset=256) "DEPARTMENT_ID"[NUMBER,22], SUM("SALARY")[22]
2 - (#keys=1; rowset=256) "DEPARTMENT_ID"[NUMBER,22],
SUM("SALARY")[22]
3 - (rowset=256) "SALARY"[NUMBER,22], "DEPARTMENT_ID"[NUMBER,22]
So first of all, you can see that a TABLE ACCESS FULL is performed (line 3), so your index assumption is not correct.
As pointed out in the other answer, the between is translated into two predicates connected with and (filter in line 1).
But most important for your question is the Column Projection: you can see that sum(SALARY) is calculated in line 2 (the HASH GROUP BY operation) and passed to line 1 (FILTER), in both cases only once (one column of length 22).
So don't worry about multiple calculation.
There is absolutely no difference between the two queries. between is just syntactical sugar; the parser immediately transforms the between condition into the two inequalities, combined with the and operator. This is done even before the optimizer sees the query. (Note that in this context the distinction between the parsing and the optimization stages is meaningful, even though often programmers think of them as a single step.)
Trivial example:
SQL> set autotrace traceonly explain
SQL> select deptno, sum(sal) as sum_sal
2 from scott.emp
3 group by deptno
4 having sum(sal) between 10000 and 20000
5 ;
Execution Plan
----------------------------------------------------------
Plan hash value: 2138686577
----------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 7 | 4 (25)| 00:00:01 |
|* 1 | FILTER | | | | | |
| 2 | HASH GROUP BY | | 1 | 7 | 4 (25)| 00:00:01 |
| 3 | TABLE ACCESS FULL| EMP | 14 | 98 | 3 (0)| 00:00:01 |
----------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter(SUM("SAL")>=10000 AND SUM("SAL")<=20000)
The "index on..." thing that you mention has nothing to do with the question.
Another fun way to test this:
with function expand_sql_text(text_in varchar2)
return varchar2
as
text_out long;
begin
dbms_utility.expand_sql_text(text_in, text_out);
return text_out;
end expand_sql_text;
select expand_sql_text(
'select * from dual where 2 between 1 and 3'
) as text_out
from dual
/
TEXT_OUT
------------------------------------------------------------------------------------------------------------------------------------------------------------
SELECT "A1"."DUMMY" "DUMMY" FROM "SYS"."DUAL" "A1" WHERE 2>=1 AND 2<=3
1 row selected.
In your original question, the second predicate was
having sum(t.amount) > -0.009 and sum(t.amount)< 0.009
which is not the same as the between version, because between is inclusive: it also matches the two boundary values.
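A small illustration of the boundary behaviour, using a literal that sits exactly on the upper bound (the column alias is only for illustration):

```sql
-- BETWEEN includes both endpoints, so this returns one row
select 'matched' as result
from dual
where 0.009 between -0.009 and 0.009;

-- strict inequalities exclude the endpoints, so this returns no rows
select 'matched' as result
from dual
where 0.009 > -0.009 and 0.009 < 0.009;
```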
In SQL generally, filter predicates against simple literals do not normally lead to any significant performance overhead. In a query with a group by clause, the fact that the having predicate is applied after aggregation reduces any overhead even further.
When there is a correlated query, what is the sequence of execution?
Ex:
select
p.productNo,
(
select count(distinct concat(bom.detailpart,bom.groupname))
from dl_MBOM bom
where bom.DetailPart=p.ProductNo
) cnt1
from dm_product p
The execution plan will vary by database vendors. For Oracle, here is a similar query, and the corresponding execution plan.
select dname,
( select count( distinct job )
from emp e
where e.deptno = d.deptno
) x
from dept d
---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | | | 6 (100)| |
| 1 | SORT GROUP BY | | 1 | 11 | | |
|* 2 | TABLE ACCESS FULL| EMP | 5 | 55 | 2 (0)| 00:00:01 |
| 3 | TABLE ACCESS FULL | DEPT | 4 | 52 | 2 (0)| 00:00:01 |
---------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("E"."DEPTNO"=:B1)
While it seems likely that the DBMS reads record for record from dm_product and for each such record looks up the value in dl_MBOM, this doesn't necessarily happen.
With an SQL query you tell the DBMS mainly what to do, not how to do it. If the DBMS thinks it better to build a join instead and work on this, it is free to do so.
Short answer: the sequence of execution is not determined. (You can, however, in many DBMS look at the query's execution plan to see how it is executed.)
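To make the "join instead" idea concrete, here is how the Oracle example above could be written as an outer join by hand. This is only a sketch of a form that gives the same result, not necessarily what any optimizer produces internally, and it assumes deptno uniquely identifies a row in dept:

```sql
select d.dname,
       count(distinct e.job) as x   -- counts 0 for departments without employees
from   dept d
left join emp e
       on e.deptno = d.deptno
group by d.deptno, d.dname;
```

Comparing the execution plans of both forms shows whether your DBMS treats them differently.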
I would like to know the best way, in an Oracle query, to avoid updating a field if its value is unchanged.
Update xtab1 set xfield1='xxx' where xkey='123';
From a performance point of view, what is the best way to make sure this update is not executed if the existing value of xfield1 is already 'xxx'?
Option 1:
Step 1: Run a SELECT to fetch the value of xfield1.
Step 2: Only if that value is not 'xxx', run the UPDATE.
Option 2:
Run the update as below:
Update xtab1 set xfield1='xxx' where xkey='123' and xfield1 <> 'xxx'
Please let me know which of the two is the better approach, or whether there is another approach I should use.
Appreciate your help
Update xtab1 set xfield1='xxx' where xkey='123' and xfield1 <> 'xxx'
The filter predicate is applied before the update is done. So I would go with option 2 and let Oracle do the job for you, rather than filtering out the rows manually first. Doing it in two separate steps would also add overhead; the filtering of rows should be part of the same step.
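One thing to watch with option 2: a comparison with NULL is never true, so rows where xfield1 is NULL are skipped by xfield1 <> 'xxx'. If the column is nullable and such rows should be updated too, a variant along these lines could be used (a sketch, assuming NULL counts as "different from 'xxx'"):

```sql
update xtab1
set    xfield1 = 'xxx'
where  xkey = '123'
and    (xfield1 <> 'xxx' or xfield1 is null);  -- NULL rows are updated as well
```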
Regarding the performance, I think indexes would play an important role.
You can test it and see:
Without index
Option 1
SQL> EXPLAIN PLAN FOR
2 UPDATE t SET sal = 9999 WHERE deptno = 20;
Explained.
SQL>
SQL> SELECT * FROM TABLE(dbms_xplan.display);
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
Plan hash value: 931696821
---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | UPDATE STATEMENT | | 5 | 35 | 3 (0)| 00:00:01 |
| 1 | UPDATE | T | | | | |
|* 2 | TABLE ACCESS FULL| T | 5 | 35 | 3 (0)| 00:00:01 |
---------------------------------------------------------------------------
Predicate Information (identified by operation id):
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
---------------------------------------------------
2 - filter("DEPTNO"=20)
14 rows selected.
SQL>
Option 2
SQL> EXPLAIN PLAN FOR
2 UPDATE t SET sal = 9999 WHERE deptno = 20 AND sal<>9999;
Explained.
SQL>
SQL> SELECT * FROM TABLE(dbms_xplan.display);
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
Plan hash value: 931696821
---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | UPDATE STATEMENT | | 4 | 28 | 3 (0)| 00:00:01 |
| 1 | UPDATE | T | | | | |
|* 2 | TABLE ACCESS FULL| T | 4 | 28 | 3 (0)| 00:00:01 |
---------------------------------------------------------------------------
Predicate Information (identified by operation id):
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
---------------------------------------------------
2 - filter("DEPTNO"=20 AND "SAL"<>9999)
14 rows selected.
With Index
SQL> CREATE INDEX t_idx ON t(deptno,sal);
Index created.
Option 1
SQL> EXPLAIN PLAN FOR
2 UPDATE t SET sal = 9999 WHERE deptno = 20;
Explained.
SQL>
SQL> SELECT * FROM TABLE(dbms_xplan.display);
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
Plan hash value: 1175576152
---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | UPDATE STATEMENT | | 5 | 35 | 1 (0)| 00:00:01 |
| 1 | UPDATE | T | | | | |
|* 2 | INDEX RANGE SCAN| T_IDX | 5 | 35 | 1 (0)| 00:00:01 |
---------------------------------------------------------------------------
Predicate Information (identified by operation id):
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
---------------------------------------------------
2 - access("DEPTNO"=20)
14 rows selected.
SQL>
Option 2
SQL> EXPLAIN PLAN FOR
2 UPDATE t SET sal = 9999 WHERE deptno = 20 AND sal<>9999;
Explained.
SQL>
SQL> SELECT * FROM TABLE(dbms_xplan.display);
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
Plan hash value: 1175576152
---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | UPDATE STATEMENT | | 4 | 28 | 1 (0)| 00:00:01 |
| 1 | UPDATE | T | | | | |
|* 2 | INDEX RANGE SCAN| T_IDX | 4 | 28 | 1 (0)| 00:00:01 |
---------------------------------------------------------------------------
Predicate Information (identified by operation id):
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
---------------------------------------------------
2 - access("DEPTNO"=20)
filter("SAL"<>9999)
15 rows selected.
SQL>
So, with option 2 the filter("SAL"<>9999) is applied in all cases, with or without the index.
I don't think there will be significant performance difference between the two options, as both will require looking up rows and performing comparison of the values. And I doubt other options such as pre-update triggers will yield better performance than your option 2.
If you really want to know how the Oracle optimizer handles your queries, try the EXPLAIN PLAN statement. For example, to see the plan that the optimizer came up with for your second option, try this:
EXPLAIN PLAN FOR
UPDATE xtab1 SET xfield1='xxx'
WHERE xkey='123' AND xfield1 <> 'xxx';
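EXPLAIN PLAN only stores the plan; it has to be read back afterwards. A minimal sketch, assuming the default plan table is used, as in the other examples in this thread:

```sql
EXPLAIN PLAN FOR
UPDATE xtab1 SET xfield1='xxx'
WHERE xkey='123' AND xfield1 <> 'xxx';

-- show the most recently explained statement from the plan table
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
```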
There is more information about what the different columns of an EXPLAIN PLAN result mean in this SO post.
Now, if you are dealing with a large number of transactions, I recommend considering other options, such as comparing the values at the application level to avoid expensive database I/O altogether where possible :-) or using some form of ETL tool that is optimized to handle large transactions.
Where would you fetch the value? In some application?
I don't think there would be much difference between the two for smaller queries. For more complex ones, I would suggest going with the second choice and letting Oracle optimize the query for you.
Thank you all.
I have chosen to go with the Option 2, even my DBA agrees to that as the better approach.
I'm wondering why the cost of this query
select * from address a
left join name n on n.address_id=a.id
where a.street='01';
is higher than that of
select * from address a
left join name n on n.address_id=a.id
where a.street=N'01';
where the address table looks like this
ID NUMBER
STREET VARCHAR2(255 CHAR)
POSTAL_CODE VARCHAR2(255 CHAR)
and the name table looks like this
ID NUMBER
ADDRESS_ID NUMBER
NAME VARCHAR2(255 CHAR)
SURNAME VARCHAR2(255 CHAR)
These are costs returned by explain plan
Explain plan for '01'
-----------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 3591 | 1595K| 87 (0)| 00:00:02 |
| 1 | NESTED LOOPS OUTER | | 3591 | 1595K| 87 (0)| 00:00:02 |
|* 2 | TABLE ACCESS FULL | ADDRESS | 3 | 207 | 3 (0)| 00:00:01 |
| 3 | TABLE ACCESS BY INDEX ROWID| NAME | 1157 | 436K| 47 (0)| 00:00:01 |
|* 4 | INDEX RANGE SCAN | NAME_HSI | 1157 | | 1 (0)| 00:00:01 |
-----------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("A"."STREET"='01')
4 - access("N"."ADDRESS_ID"(+)="A"."ID")
Explain plan for N'01'
-----------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 347 | 154K| 50 (0)| 00:00:01 |
| 1 | NESTED LOOPS OUTER | | 347 | 154K| 50 (0)| 00:00:01 |
|* 2 | TABLE ACCESS FULL | ADDRESS | 1 | 69 | 3 (0)| 00:00:01 |
| 3 | TABLE ACCESS BY INDEX ROWID| NAME | 1157 | 436K| 47 (0)| 00:00:01 |
|* 4 | INDEX RANGE SCAN | NAME_HSI | 1157 | | 1 (0)| 00:00:01 |
-----------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter(SYS_OP_C2C("A"."STREET")=U'01')
4 - access("N"."ADDRESS_ID"(+)="A"."ID")
As you can see, the cost for the N'01' query is lower than the cost for '01'. Any idea why? The N'01' query additionally needs to convert varchar to nvarchar (SYS_OP_C2C()), so its cost should be higher. The other question is why the number of rows processed by the N'01' query is lower than for '01'.
[EDIT]
Table address has 30 rows.
Table name has 19669 rows.
SYS_OP_C2C is an internal function that implicitly converts a varchar2 value to the national character set, as the TO_NCHAR function does. Thus, the filter is completely different from the one used for a normal comparison.
I am not sure why the estimated number of rows is lower in your case, but it could just as well be higher. Cost estimation won't be affected.
Let's try to see step-by-step in a test case.
SQL> CREATE TABLE t AS SELECT 'a'||LEVEL col FROM dual CONNECT BY LEVEL < 1000;
Table created.
SQL>
SQL> EXPLAIN PLAN FOR SELECT * FROM t WHERE col = 'a10';
Explained.
SQL> SELECT * FROM TABLE(dbms_xplan.display);
PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------
Plan hash value: 1601196873
--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 5 | 3 (0)| 00:00:01 |
|* 1 | TABLE ACCESS FULL| T | 1 | 5 | 3 (0)| 00:00:01 |
--------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------
1 - filter("COL"='a10')
13 rows selected.
SQL>
So far so good. Since there is only one row with value as 'a10', optimizer estimated one row.
Now let's see what happens with the national character set conversion.
SQL> EXPLAIN PLAN FOR SELECT * FROM t WHERE col = N'a10';
Explained.
SQL> SELECT * FROM TABLE(dbms_xplan.display);
PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------
Plan hash value: 1601196873
--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 10 | 50 | 3 (0)| 00:00:01 |
|* 1 | TABLE ACCESS FULL| T | 10 | 50 | 3 (0)| 00:00:01 |
--------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------
1 - filter(SYS_OP_C2C("COL")=U'a10')
13 rows selected.
SQL>
What happened here? We can see filter(SYS_OP_C2C("COL")=U'a10'), which means an internal function is applied that converts the varchar2 value to nvarchar2. The optimizer now estimates 10 rows for the filter instead of one.
This will also suppress any index usage, since now a function is applied on the column. We can tune it by creating a function-based index to avoid full table scan.
SQL> create index nchar_indx on t(to_nchar(col));
Index created.
SQL>
SQL> EXPLAIN PLAN FOR SELECT * FROM t WHERE to_nchar(col) = N'a10';
Explained.
SQL> SELECT * FROM TABLE(dbms_xplan.display);
PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------
Plan hash value: 1400144832
--------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 10 | 50 | 2 (0)| 00:00:01 |
| 1 | TABLE ACCESS BY INDEX ROWID BATCHED| T | 10 | 50 | 2 (0)| 00:00:01 |
|* 2 | INDEX RANGE SCAN | NCHAR_INDX | 4 | | 1 (0)| 00:00:01 |
--------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------
---------------------------------------------------
2 - access(SYS_OP_C2C("COL")=U'a10')
14 rows selected.
SQL>
However, will this make the execution plans similar? No. I think that with two different character sets the filter will not be applied in the same way, and that is where the difference comes from.
My research says,
Usually, such scenarios occur when the data coming via an application
is nvarchar2 type, but the table column is varchar2. Thus, Oracle
applies an internal function in the filter operation. My suggestion
is, to know your data well, so that you use similar data types during
design phase.
When worrying about explain plans, it matters whether there are current statistics on the tables. If the statistics do not represent the actual data reasonably well, then the optimizer will make mistakes and estimate cardinalities incorrectly.
You can check how long ago statistics were gathered by querying the data dictionary:
select table_name, last_analyzed
from user_tables
where table_name in ('ADDRESS','NAME');
You can gather statistics for the optimizer to use by calling DBMS_STATS:
begin
dbms_stats.gather_table_stats(user, 'ADDRESS');
dbms_stats.gather_table_stats(user, 'NAME');
end;
So perhaps after gathering statistics you will get different explain plans. Perhaps not.
The difference in your explain plans is primarily because the optimizer estimates how many rows it will find in address table differently in the two cases.
In the first case you have an equality predicate with same datatype - this is good and the optimizer can often estimate cardinality (row count) reasonably well for cases like this.
In the second case a function is applied to the column. This is often bad (unless you have function-based indexes) and will force the optimizer to take a wild guess. That wild guess will be different in different versions of Oracle, as the developers of the optimizer try to improve upon it. In some versions the wild guess will simply be something like "I guess 5% of the number of rows in the table."
When comparing different datatypes, it is best to avoid implicit conversions, particularly when like this case the implicit conversion makes a function on the column rather than the literal. If you have cases where you get a value as datatype NVARCHAR2 and need to use it in a predicate like above, it can be a good idea to explicitly convert the value to the datatype of the column.
select * from address a
left join name n on n.address_id=a.id
where a.street = CAST( N'01' AS VARCHAR2(255));
In this case with a literal it does not make sense, of course. Here you would just use your first query. But if it was a variable or function parameter, maybe you could have use cases for doing something like this.
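For the variable case, a rough sketch could look like this. The variable names l_street and l_count are made up for illustration, and the cast assumes the value only contains characters that the database character set can represent:

```sql
declare
  l_street nvarchar2(255) := N'01';  -- hypothetical value arriving as NVARCHAR2
  l_count  pls_integer;
begin
  -- convert the value, not the column, so no function is applied to a.street
  select count(*)
  into   l_count
  from   address a
  where  a.street = cast(l_street as varchar2(255));

  dbms_output.put_line('matching addresses: ' || l_count);
end;
/
```

With the conversion on the value side, the predicate on STREET stays a plain equality, which the optimizer can estimate from the column statistics.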
As I can see, the optimizer estimates 3591 rows for the first query and 347 rows for the second one. So Oracle expects fewer I/O operations, and that is why the cost is lower.
Don't be confused with
N'01' needs additionally convert varchar to nvarchar
Oracle does one hard parse and then uses soft parse for the same queries. So the longer your oracle works the faster it becomes.
When we execute a SQL statement in Oracle, a hash value is assigned to that statement and stored in the library cache, so that later, if another user requests the same query, Oracle finds the hash value and reuses the same execution plan. But I have one doubt about the hash value: how does it get generated? Does the Oracle server use some algorithm, or does it just convert the SQL string into some numeric value?
I was reading the book Pro Oracle SQL, in which it is written that
select * from employees where department_id = 60;
SELECT * FROM EMPLOYEES WHERE DEPARTMENT_ID = 60;
select /* a_comment */ * from employees where department_id = 60;
will return different hash values, because when a SQL statement is executed, Oracle first converts the string to a hash value. But when I tried this, it returned the same hash value.
SQL> select * from boats where bid=10;
no rows selected
Execution Plan
----------------------------------------------------------
Plan hash value: 2799518614
-------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 16 | 1 (0)| 00:00:01 |
| 1 | TABLE ACCESS BY INDEX ROWID| BOATS | 1 | 16 | 1 (0)| 00:00:01 |
|* 2 | INDEX UNIQUE SCAN | B_PK | 1 | | 0 (0)| 00:00:01 |
-------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("BID"=10)
SQL> SELECT * FROM BOATS WHERE BID=10;
no rows selected
Execution Plan
----------------------------------------------------------
Plan hash value: 2799518614
-------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 16 | 1 (0)| 00:00:01 |
| 1 | TABLE ACCESS BY INDEX ROWID| BOATS | 1 | 16 | 1 (0)| 00:00:01 |
|* 2 | INDEX UNIQUE SCAN | B_PK | 1 | | 0 (0)| 00:00:01 |
-------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("BID"=10)
In the text of your question, you appear to be describing the sql_id and/or the hash_value. This is the hash of the text of the SQL statement and is what Oracle uses to determine whether a particular SQL statement already exists in the shared pool. What you are showing in your example, however, is the plan_hash_value which is the hash of the plan that is generated for the SQL statement. There is, potentially, a many-to-many relationship between the two. A single SQL statement (sql_id/ hash_value) can have multiple different plans (plan_hash_value) and multiple different SQL statements can share the same plan.
So, for example, if I write two different SQL statements that are querying a particular row from the EMP table, I'll get the same plan_hash_value.
SQL> set autotrace traceonly;
SQL> select * from emp where ename = 'BOB';
no rows selected
Execution Plan
----------------------------------------------------------
Plan hash value: 3956160932
--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 39 | 3 (0)| 00:00:01 |
|* 1 | TABLE ACCESS FULL| EMP | 1 | 39 | 3 (0)| 00:00:01 |
--------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("ENAME"='BOB')
SQL> ed
Wrote file afiedt.buf
1* select * FROM emp WHERE ename = 'BOB'
SQL> /
no rows selected
Execution Plan
----------------------------------------------------------
Plan hash value: 3956160932
--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 39 | 3 (0)| 00:00:01 |
|* 1 | TABLE ACCESS FULL| EMP | 1 | 39 | 3 (0)| 00:00:01 |
--------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("ENAME"='BOB')
If I look in v$sql, however, I'll see that two different sql_id and hash_value values were generated
SQL> set autotrace off;
SQL> ed
Wrote file afiedt.buf
1 select sql_id, sql_text, hash_value, plan_hash_value
2 from v$sql
3 where sql_text like 'select%BOB%'
4* and length(sql_text) < 50
SQL> /
SQL_ID SQL_TEXT HASH_VALUE PLAN_HASH_VALUE
------------- ---------------------------------------- ---------- ---------------
161v96c0v9c0n select * FROM emp WHERE ename = 'BOB' 28618772 3956160932
cvs1krtgzfr78 select * from emp where ename = 'BOB' 1610046696 3956160932
Oracle recognizes that these two statements are different queries with different sql_id and hash_value hashes. But they both happen to generate the same plan so they end up with the same plan_hash_value.
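To see this relationship across the whole shared pool rather than for two hand-picked statements, a query along these lines could be used. It is a sketch that assumes you have select privileges on v$sql; the column alias is my own:

```sql
-- plans that are currently shared by more than one distinct SQL statement
select plan_hash_value,
       count(distinct sql_id) as statement_count
from   v$sql
where  plan_hash_value <> 0          -- skip cursors without a plan
group by plan_hash_value
having count(distinct sql_id) > 1
order by statement_count desc;
```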
I would say that you just proved that the book is wrong in this case. And theoretically it seems better to have the hash identify the conceptual SQL statement instead of a randomly capitalized string... And I hope the comments get ignored too when generating the hash. ;-)
set lines 300
col BEGIN_INTERVAL_TIME for a30
select a.snap_id, a.begin_interval_time, b.plan_hash_value from dba_hist_snapshot a, dba_hist_sqlstat b where a.snap_id=b.snap_id and b.sql_id='&sql_id' order by 1;