Sunday, July 26, 2009

PLAN_HASH_VALUE - How equal (and stable?) are your execution plans - part 2

Back to part 1

In the previous post I've shown some characteristics of the PLAN_HASH_VALUE information provided by Oracle.

Now if you want to have greater control over how a hash value on the execution plan should be calculated, in particular regarding some attributes of the PLAN_TABLE column that are not used to calculate the provided PLAN_HASH_VALUE, e.g. the filter and access predicates, here are some ideas how to do it yourself.

Obviously we need to calculate a hash value across multiple columns and multiple rows from a given PLAN_TABLE-equivalent source, which could be e.g. V$SQL_PLAN or a real PLAN_TABLE. This requires the following:

- A function calculating a hash value from a given input
- The input needs to represent the columns and rows from the PLAN_TABLE
- Therefore ideally we need to combine the columns to a single expression
- And finally combine the expressions from the different rows again to a single expression that can be fed into the hash value function.

In Oracle 10.2 and later all these things can be done with built-in functions. In previous releases some more work and user-defined functions are required, but it's definitely feasible without too much effort.

A function calculating a hash value from a given input

Apart from any user-defined functions that calculate a hash value there are multiple options available provided by Oracle as built-in functions.

In pre-10.2 versions there are only a few built-in functions that calculate a hash value.

One is DBMS_UTILITY.GET_HASH_VALUE which has already been there since pre-9i releases (at least 8i, but likely even earlier). The other one is DBMS_OBFUSCATION_TOOLKIT.MD5 which has been introduced in 9i.

The main difference is that the MD5 function - the name suggests it - calculates a hash value according to the standard MD5 algorithm whereas GET_HASH_VALUE simply returns a 31bit hash value calculated.

Version 10 adds some more functions, in particular the built-in ORA_HASH function and the DBMS_CRYPTO package that basically supersedes the DBMS_OBFUSCATION_TOOLKIT with enhanced functionality.

These functions differ in some details but basically can all be used for the given task.

Here I'm just highlighting some noticeable variations:

The DBMS_OBFUSCATION_TOOLKIT.MD5 is not very well documented and quite cumbersome to use. It cannot be used directly from SQL (you get always "ORA-06553: PLS-307: too many declarations of 'MD5' match this call), and it throws an exception when passing a NULL value. Therefore it's probably best to be wrapped by a user-defined function, e.g. something like this:


create or replace function hashkey(in_string in varchar2) return varchar2 as
begin
if in_string is null then
return to_char(null);
else
return rawtohex(utl_raw.cast_to_raw(sys.dbms_obfuscation_toolkit.md5(input_string=>in_string)));
end if;
end;
/


Note that this version of MD5 uses a string as input but seems to return a raw value, but VARCHAR2 as data type. Therefore some cumbersome conversions are required (cast to raw and then raw to hex) to get actual VARCHAR2 output.

The DBMS_CRYPTO package offers some overloaded versions of the HASH function. It's interesting to note that DBMS_CRYPTO.HASH offers the capability to process LOBs as input to calculate the hash value.

The most interesting function due to its simplicity and versatility is the ORA_HASH function. One special feature is that it is capable of taking nested tables as input in addition to the normal built-in scalar data types.

Here's a simple performance comparison of the different functions (performed on 11.1.0.7 Win32):


SQL>
SQL> drop function hashkey;

Function dropped.

Elapsed: 00:00:00.04
SQL>
SQL> create or replace function hashkey(in_string in varchar2) return varchar2 as
2 begin
3 if in_string is null then
4 return to_char(null);
5 else
6 return rawtohex(utl_raw.cast_to_raw(sys.dbms_obfuscation_toolkit.md5(input_string=>in_string)));
7 end if;
8 end;
9 /

Function created.

Elapsed: 00:00:00.04

SQL> drop table random_data purge;

Table dropped.

Elapsed: 00:00:00.04
SQL>
SQL> exec dbms_random.seed(0)

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.01
SQL>
SQL> create table random_data
2 as
3 select
4 dbms_random.string('A', 100) as the_data
5 from
6 dual
7 connect by
8 level <= 100000;

Table created.

Elapsed: 00:00:31.30
SQL>
SQL> insert /*+ append */ into random_data
2 select * from random_data;

100000 rows created.

Elapsed: 00:00:01.17
SQL>
SQL> commit;

Commit complete.

Elapsed: 00:00:00.01
SQL>
SQL> insert /*+ append */ into random_data
2 select * from random_data;

200000 rows created.

Elapsed: 00:00:01.06
SQL>
SQL> commit;

Commit complete.

Elapsed: 00:00:00.01
SQL>
SQL> select count(the_data)
2 from (
3 select
4 the_data
5 from
6 random_data
7 );

COUNT(THE_DATA)
---------------
400000

Elapsed: 00:00:01.66
SQL>
SQL> select count(the_data)
2 from (
3 select
4 the_data
5 from
6 random_data
7 );

COUNT(THE_DATA)
---------------
400000

Elapsed: 00:00:01.14
SQL>
SQL> select max(the_hash)
2 from (
3 select
4 ora_hash(the_data) as the_hash
5 from
6 random_data
7 );

MAX(THE_HASH)
-------------
4294934713

Elapsed: 00:00:02.09
SQL>
SQL> select max(the_hash)
2 from (
3 select
4 ora_hash(the_data) as the_hash
5 from
6 random_data
7 );

MAX(THE_HASH)
-------------
4294934713

Elapsed: 00:00:01.54
SQL>
SQL> select max(the_hash)
2 from (
3 select
4 dbms_utility.get_hash_value(the_data, 0, 2147483647) as the_hash
5 from
6 random_data
7 );

MAX(THE_HASH)
-------------
2147451066

Elapsed: 00:00:04.48
SQL>
SQL> select max(the_hash)
2 from (
3 select
4 dbms_utility.get_hash_value(the_data, 0, 2147483647) as the_hash
5 from
6 random_data
7 );

MAX(THE_HASH)
-------------
2147451066

Elapsed: 00:00:04.08
SQL>
SQL> select max(the_hash)
2 from (
3 select
4 hashkey(the_data) as the_hash
5 from
6 random_data
7 );

MAX(THE_HASH)
--------------------------------------------------------------------------------
FFFF54FE7872DBDD4E870FFC20C1DDE0

Elapsed: 00:00:15.23
SQL>
SQL> select max(the_hash)
2 from (
3 select
4 hashkey(the_data) as the_hash
5 from
6 random_data
7 );

MAX(THE_HASH)
--------------------------------------------------------------------------------
FFFF54FE7872DBDD4E870FFC20C1DDE0

Elapsed: 00:00:14.19
SQL>
SQL> select max(the_hash)
2 from (
3 select
4 dbms_crypto.hash(utl_raw.cast_to_raw(the_data), 2) as the_hash
5 from
6 random_data
7 );

MAX(THE_HASH)
--------------------------------------------------------------------------------
FFFF54FE7872DBDD4E870FFC20C1DDE0

Elapsed: 00:00:14.42
SQL>
SQL> select max(the_hash)
2 from (
3 select
4 dbms_crypto.hash(utl_raw.cast_to_raw(the_data), 2) as the_hash
5 from
6 random_data
7 );

MAX(THE_HASH)
--------------------------------------------------------------------------------
FFFF54FE7872DBDD4E870FFC20C1DDE0

Elapsed: 00:00:13.66
SQL>


So it's obvious that the ORA_HASH function has the least overhead (no PL/SQL context switch required), and the standard MD5 based functions do have the most overhead, but the calculated hash values are 128bit hashes and therefore are much more robust regarding hash collisions, but that doesn't really matter in our case here.

For our particular case here the DBMS_CRYPTO.HASH and the ORA_HASH function are best suited because they are capable of handling large concatenations. In case of DBMS_CRYPTO.HASH this is done via support for CLOBs, in case of ORA_HASH it's the support for collections.

The input needs to represent the columns and rows from the PLAN_TABLE

Concatenating the columns of interest to build a single expression from a PLAN_TABLE row is straightforward, however it poses the potential issue that some of the attributes that are likely to be included are of VARCHAR2(4000) type, so a simple VARCHAR2 based concatenation is not possible. You either need to use a CLOB instead or somehow shorten the expression. In order to avoid costly temporary LOB concatenation operations I prefer the latter and use again a hash function on the large columns to shorten the expression significantly, allowing to represent the whole row expression in a single VARCHAR2(4000).

Building a single expression for an execution plan out of the row expressions is again something that can be addressed by well-known techniques that transform rows to columns (basically a pivot operation which is explicitly supported by the PIVOT operator introduced in 11g), or in this particular case you could call it a "Rows to string" or "String Aggregation" operation.

There are several ways how to accomplish this using SQL, some of them are e.g. demonstrated on the SQL snippets site.

In this particular case again we have the potential issue that the resulting string could exceed the 4,000 bytes limit imposed by the VARCHAR2 data type, and therefore some of the available techniques might not be appropriate.

For instance the "hierarchical query" approach for string aggregation will fail with "result of string concatenation too long" in the SYS_CONNECT_BY_PATH function used to concatenate the string in such cases.

Using nested tables, the build-in 10g COLLECT aggregate function and a custom TO_STRING function that is capable of handling and generating CLOBs can not be used either because Oracle doesn't guarantee the order of aggregates and therefore the order of the aggregated strings might not be the same for the same input data leading to different hash values.


In 10g there are a couple of favorable approaches to this:

1. Use the MODEL clause to perform the aggregation. This allows to control the aggregation process and therefore can prevent any error caused by strings longer than 4,000 characters/bytes. Furthermore the model clause implicitly defines an order of processing by the dimensions defined in the model. The order is important in our case, since the concatenated values need to be ordered always in the same way, otherwise the same execution plan might result in different hash values.

2. Use some other built-in aggregation method capable of handling CLOBs, e.g. the XMLType based XMLAGG function can return a CLOB from the resulting XMLType (but depending on what you do be aware of specialties like the treatment of XML special characters like ampersand, less than, greater than, single-quote and double-quote, which can be addressed using the DBMS_XMLGEN.CONVERT function, which is in our case not an issue, since we simply what to generate a hash value and are not interested in any special character treatment as along as it is done consistently).

3. Use a custom aggregation function, similar to Tom Kyte's STRAGG custom aggregate function. This one also needs special treatment to address the requirement to order the aggregates in a deterministic way.

4. Use nested tables and the ORA_HASH function. This seems to be the most straightforward solution in 10g, since it addresses the sort order issue due to the implicit ordering performed by the ORA_HASH function, avoids any costly LOB operations and requires the least coding effort.

In pre-10g versions there are only a couple of options left:

1. The MODEL clause is not supported on pre-10g

2. Use some other built-in aggregation method capable of handling CLOBs, e.g. the XMLAGG function is supported in 9i

3. Use a custom aggregation function

4. ORA_HASH is not supported either

One particular issue in 9i is that it doesn't provide a built-in HASH function that supports large concatenations. Neither DBMS_CRYPTO.HASH nor ORA_HASH are supported.

Examples

So here are some examples for the aforementioned options. If you're going to use V$SQL_PLAN as source then note that in order to minimize the latch contention and the risk of inconsistencies when reading from V$SQL_PLAN (Oracle doesn't guarantee any read consistency on dynamic performance views) you might want to populate a (global temporary) table from the V$SQL_PLAN or use at least the (still undocumented) MATERIALIZE hint of the subquery factoring WITH clause to generate a global temporary table on the fly from the contents.

For the following examples a simple table has been created via CTAS:


create table my_sql_plan
as
select * from v$sql_plan;


Note that the examples cover all available columns from V$SQL_PLAN (Version 10.2.0.4 in that case) starting with the OPERATION column up to and except for the OTHER_XML column.

This means that you get different hash values even if e.g. only the BYTES or CPU_COST columns are different. For a reasonable analysis this is probably too granular and you should omit those columns that you don't want to consider for the hash value calculation.

1. Calculating a plan hash value using the MODEL clause for string aggregation


select
hash_value
, plan_hash_value
, child_number
, sys.dbms_crypto.hash(aggr1 || aggr2 || aggr3 || dummy_clob, 2) as the_hash
from (
select
hash_value
, plan_hash_value
, child_number
, aggr1
, aggr2
, aggr3
from (
select
hash_value
, plan_hash_value
, child_number
, id
, dense_rank() over (order by hash_value, plan_hash_value, child_number) as rnk
, sys.dbms_crypto.hash(utl_raw.cast_to_raw(
operation
|| '-' || options
|| '-' || object_node
|| '-' || object#
|| '-' || object_owner
|| '-' || object_name
|| '-' || object_alias
|| '-' || object_type
|| '-' || optimizer
|| '-' || id
|| '-' || parent_id
|| '-' || depth
|| '-' || position
|| '-' || search_columns
|| '-' || cost
|| '-' || cardinality
|| '-' || bytes
|| '-' || other_tag
|| '-' || partition_start
|| '-' || partition_stop
|| '-' || partition_id
|| '-' || dbms_crypto.hash(utl_raw.cast_to_raw(substr(other, 1, 2000)), 2)
|| '-' || dbms_crypto.hash(utl_raw.cast_to_raw(substr(other, 2001)), 2)
|| '-' || distribution
|| '-' || cpu_cost
|| '-' || io_cost
|| '-' || temp_space
|| '-' || dbms_crypto.hash(utl_raw.cast_to_raw(substr(access_predicates, 1, 2000)), 2)
|| '-' || dbms_crypto.hash(utl_raw.cast_to_raw(substr(access_predicates, 2001)), 2)
|| '-' || dbms_crypto.hash(utl_raw.cast_to_raw(substr(filter_predicates, 1, 2000)), 2)
|| '-' || dbms_crypto.hash(utl_raw.cast_to_raw(substr(filter_predicates, 2001)), 2)
|| '-' || dbms_crypto.hash(utl_raw.cast_to_raw(substr(projection, 1, 2000)), 2)
|| '-' || dbms_crypto.hash(utl_raw.cast_to_raw(substr(projection, 2001)), 2)
|| '-' || time
|| '-' || qblock_name
|| '-' || dbms_crypto.hash(utl_raw.cast_to_raw(substr(remarks, 1, 2000)), 2)
|| '-' || dbms_crypto.hash(utl_raw.cast_to_raw(substr(remarks, 2001)), 2)
)
, 2) as hash_path_row
from (
select
*
from
my_sql_plan
)
)
model
return updated rows
partition by (
rnk
)
dimension by (
id
)
measures (
cast(hash_path_row as varchar2(4000)) as aggr1
, cast('' as varchar2(4000)) as aggr2
, cast('' as varchar2(4000)) as aggr3
, hash_value
, plan_hash_value
, child_number
)
rules
iterate (10000)
until presentv(aggr1[ITERATION_NUMBER + 2], 1, 2) = 2 (
aggr1[0] = aggr1[0] ||
case
when length(aggr1[0]) + length(aggr1[ITERATION_NUMBER + 1]) + 1 > 4000
or length(aggr2[0]) + length(aggr1[ITERATION_NUMBER + 1]) + 1 > 4000
then ''
else '-' || aggr1[ITERATION_NUMBER+1]
end,
aggr2[0] = aggr2[0] ||
case
when length(aggr1[0]) + length(aggr1[ITERATION_NUMBER + 1]) + 1 > 4000
and length(aggr2[0]) + length(aggr1[ITERATION_NUMBER + 1]) + 1 < 4000
then '-' || aggr1[ITERATION_NUMBER+1]
else ''
end,
aggr3[0] = aggr3[0] ||
case
when length(aggr2[0]) + length(aggr1[ITERATION_NUMBER + 1]) + 1 > 4000
then '-' || aggr1[ITERATION_NUMBER+1]
else ''
end
)
),
(
select
substr(other_xml, 1, 0) as dummy_clob
from
my_sql_plan
where
rownum <= 1
);


Note that a dummy CLOB is used for the final string concatenation. Obviously other hash functions instead of DBMS_CRYPTO.HASH could be used but are limited by the 4,000 bytes maximum size of VARCHAR2 (except for ORA_HASH, see below). Due to the RAW input type the columns exceeding 2,000 bytes are split apart, since a RAW value can have a maximum of 2,000 bytes. However I've encountered different behaviour, under some circumstances the conversion failed when more than 2,000 bytes were in the VARCHAR2 data, but some other tests were successfully able to work on converted VARCHAR2 values larger than 2,000 bytes.

2. Calculating a plan hash value using the XMLAGG function for string aggregation


select
hash_value
, plan_hash_value
, child_number
, sys.dbms_crypto.hash(
substr(
dbms_xmlgen.convert(
extract(
xmlagg(
xmlelement("V", '-' || rawtohex(hash_path_row)
)
order by rawtohex(hash_path_row)
), '/V/text()'
).getclobval()
, 1
)
, 2
)
, 2
) as the_hash
from (
select
hash_value
, plan_hash_value
, child_number
, hash_path_row
from (
select
hash_value
, plan_hash_value
, child_number
, id
, dense_rank() over (order by hash_value, plan_hash_value, child_number) as rnk
, sys.dbms_crypto.hash(utl_raw.cast_to_raw(
operation
|| '-' || options
|| '-' || object_node
|| '-' || object#
|| '-' || object_owner
|| '-' || object_name
|| '-' || object_alias
|| '-' || object_type
|| '-' || optimizer
|| '-' || id
|| '-' || parent_id
|| '-' || depth
|| '-' || position
|| '-' || search_columns
|| '-' || cost
|| '-' || cardinality
|| '-' || bytes
|| '-' || other_tag
|| '-' || partition_start
|| '-' || partition_stop
|| '-' || partition_id
|| '-' || dbms_crypto.hash(utl_raw.cast_to_raw(substr(other, 1, 2000)), 2)
|| '-' || dbms_crypto.hash(utl_raw.cast_to_raw(substr(other, 2001)), 2)
|| '-' || distribution
|| '-' || cpu_cost
|| '-' || io_cost
|| '-' || temp_space
|| '-' || dbms_crypto.hash(utl_raw.cast_to_raw(substr(access_predicates, 1, 2000)), 2)
|| '-' || dbms_crypto.hash(utl_raw.cast_to_raw(substr(access_predicates, 2001)), 2)
|| '-' || dbms_crypto.hash(utl_raw.cast_to_raw(substr(filter_predicates, 1, 2000)), 2)
|| '-' || dbms_crypto.hash(utl_raw.cast_to_raw(substr(filter_predicates, 2001)), 2)
|| '-' || dbms_crypto.hash(utl_raw.cast_to_raw(substr(projection, 1, 2000)), 2)
|| '-' || dbms_crypto.hash(utl_raw.cast_to_raw(substr(projection, 2001)), 2)
|| '-' || time
|| '-' || qblock_name
|| '-' || dbms_crypto.hash(utl_raw.cast_to_raw(substr(remarks, 1, 2000)), 2)
|| '-' || dbms_crypto.hash(utl_raw.cast_to_raw(substr(remarks, 2001)), 2)
)
, 2) as hash_path_row
from (
select
*
from
my_sql_plan
)
)
)
group by
hash_value
, plan_hash_value
, child_number;


Note that I've added the DBMS_XMLGEN.CONVERT function for convenience to demonstrate its usage to overcome the handling of XML special characters like ampersand, greater than etc. Again different hash functions instead of DBMS_CRYPTO.HASH could be used but are limited by the 4,000 bytes maximum size of VARCHAR2 (except for ORA_HASH, see below).

3. Calculating a plan hash value using a user-defined aggregation function for string aggregation


drop function aggregate_concat_ord;

drop type agg_concat_ord;

drop type table_of_varchar;

create or replace type table_of_varchar as table of varchar2(4000);
/

create or replace type agg_concat_ord as object
(
vals table_of_varchar,
static function ODCIAggregateInitialize (init_context IN OUT agg_concat_ord) return number,
member function ODCIAggregateIterate (self IN OUT agg_concat_ord, this_value IN varchar2) return number,
member function ODCIAggregateTerminate (self IN agg_concat_ord, result OUT clob, flags IN number) return number,
member function ODCIAggregateMerge (self IN OUT agg_concat_ord, merge_context IN agg_concat_ord) return number
);
/

create or replace type body agg_concat_ord is
static function ODCIAggregateInitialize (init_context in out agg_concat_ord) return number is
begin
init_context := agg_concat_ord (null);
init_context.vals := table_of_varchar();
return ODCIConst.Success;
end;
member function ODCIAggregateIterate (self in out agg_concat_ord, this_value in varchar2) return number is
begin
self.vals.extend;
self.vals (self.vals.last) := this_value;
return ODCIConst.Success;
end;
member function ODCIAggregateTerminate (self IN agg_concat_ord, result OUT clob, flags IN number ) return number is
begin
for r in (select column_value from table (self.vals) order by 1) loop
result := result || r.column_value;
end loop;
return ODCIConst.Success;
end;
member function ODCIAggregateMerge (self IN OUT agg_concat_ord, merge_context IN agg_concat_ord) return number is
i integer;
begin
i := merge_context.vals.first;
while i is not null loop
self.vals.extend;
self.vals (self.vals.last) := merge_context.vals (i);
i := merge_context.vals.next (i);
end loop;
return ODCIConst.Success;
end;
end;
/

create or replace function aggregate_concat_ord (input varchar2) return clob parallel_enable
aggregate using agg_concat_ord;
/

select
hash_value
, plan_hash_value
, child_number
, sys.dbms_crypto.hash(substr(aggregate_concat_ord('-' || rawtohex(hash_path_row)), 2), 2) as the_hash
from (
select
hash_value
, plan_hash_value
, child_number
, hash_path_row
from (
select
hash_value
, plan_hash_value
, child_number
, id
, dense_rank() over (order by hash_value, plan_hash_value, child_number) as rnk
, sys.dbms_crypto.hash(utl_raw.cast_to_raw(
operation
|| '-' || options
|| '-' || object_node
|| '-' || object#
|| '-' || object_owner
|| '-' || object_name
|| '-' || object_alias
|| '-' || object_type
|| '-' || optimizer
|| '-' || id
|| '-' || parent_id
|| '-' || depth
|| '-' || position
|| '-' || search_columns
|| '-' || cost
|| '-' || cardinality
|| '-' || bytes
|| '-' || other_tag
|| '-' || partition_start
|| '-' || partition_stop
|| '-' || partition_id
|| '-' || dbms_crypto.hash(utl_raw.cast_to_raw(substr(other, 1, 2000)), 2)
|| '-' || dbms_crypto.hash(utl_raw.cast_to_raw(substr(other, 2001)), 2)
|| '-' || distribution
|| '-' || cpu_cost
|| '-' || io_cost
|| '-' || temp_space
|| '-' || dbms_crypto.hash(utl_raw.cast_to_raw(substr(access_predicates, 1, 2000)), 2)
|| '-' || dbms_crypto.hash(utl_raw.cast_to_raw(substr(access_predicates, 2001)), 2)
|| '-' || dbms_crypto.hash(utl_raw.cast_to_raw(substr(filter_predicates, 1, 2000)), 2)
|| '-' || dbms_crypto.hash(utl_raw.cast_to_raw(substr(filter_predicates, 2001)), 2)
|| '-' || dbms_crypto.hash(utl_raw.cast_to_raw(substr(projection, 1, 2000)), 2)
|| '-' || dbms_crypto.hash(utl_raw.cast_to_raw(substr(projection, 2001)), 2)
|| '-' || time
|| '-' || qblock_name
|| '-' || dbms_crypto.hash(utl_raw.cast_to_raw(substr(remarks, 1, 2000)), 2)
|| '-' || dbms_crypto.hash(utl_raw.cast_to_raw(substr(remarks, 2001)), 2)
)
, 2) as hash_path_row
from (
select
*
from
my_sql_plan
)
)
)
group by
hash_value
, plan_hash_value
, child_number;


Note that the user-defined aggregation function has two noteworthy characteristics: It ensures an order of the aggregates and it generates a CLOB value.
Again different hash functions could be used instead of DBMS_CRYPTO.HASH with the known limitations.

4. Calculating a plan hash value using the ORA_HASH and a user-defined type for string aggregation


create or replace type ntt_varchar2 as table of varchar2(4000);
/

select
hash_value
, plan_hash_value
, child_number
, ora_hash(cast(collect(to_char(hash_path_row, 'TM')) as ntt_varchar2)) as the_hash
from (
select
hash_value
, plan_hash_value
, child_number
, hash_path_row
from (
select
hash_value
, plan_hash_value
, child_number
, id
, dense_rank() over (order by hash_value, plan_hash_value, child_number) as rnk
, ora_hash(
operation
|| '-' || options
|| '-' || object_node
|| '-' || object#
|| '-' || object_owner
|| '-' || object_name
|| '-' || object_alias
|| '-' || object_type
|| '-' || optimizer
|| '-' || id
|| '-' || parent_id
|| '-' || depth
|| '-' || position
|| '-' || search_columns
|| '-' || cost
|| '-' || cardinality
|| '-' || bytes
|| '-' || other_tag
|| '-' || partition_start
|| '-' || partition_stop
|| '-' || partition_id
|| '-' || ora_hash(other)
|| '-' || distribution
|| '-' || cpu_cost
|| '-' || io_cost
|| '-' || temp_space
|| '-' || ora_hash(access_predicates)
|| '-' || ora_hash(filter_predicates)
|| '-' || ora_hash(projection)
|| '-' || time
|| '-' || qblock_name
|| '-' || ora_hash(remarks)
) as hash_path_row
from (
select
*
from
my_sql_plan
)
)
)
group by
hash_value
, plan_hash_value
, child_number;


This example avoids any costly LOB operations. The string aggregation is performed by the COLLECT function that is available from 10g on. Note that the sort order issue caused by the COLLECT function (actually the lack of any defined order within the aggregate) is addressed by the ORA_HASH function that obviously sorts the passed data in case a collection is passed. This can be indirectly noticed when omitting the CAST operation to the NTT_VARCHAR2 custom collection type.

Here are some variations suitable for 9i databases:

1. Using a custom HASHKEY function built on top of DBMS_OBFUSCATION_TOOLKIT and the user-defined aggregation function from above


create or replace function hashkey(in_string in varchar2) return varchar2 as
begin
if in_string is null then
return to_char(null);
else
return rawtohex(utl_raw.cast_to_raw(sys.dbms_obfuscation_toolkit.md5(input_string=>in_string)));
end if;
end;
/

select
hash_value
--, plan_hash_value
, child_number
, hashkey(substr(aggregate_concat_ord('-' || hash_path_row), 2, 4000)) as the_hash
from (
select
hash_value
--, plan_hash_value
, child_number
, hashkey(
operation
|| '-' || options
|| '-' || object_node
|| '-' || object#
|| '-' || object_owner
|| '-' || object_name
|| '-' || optimizer
|| '-' || id
|| '-' || parent_id
|| '-' || depth
|| '-' || position
|| '-' || search_columns
|| '-' || cost
|| '-' || cardinality
|| '-' || bytes
|| '-' || other_tag
|| '-' || partition_start
|| '-' || partition_stop
|| '-' || partition_id
|| '-' || hashkey(other)
|| '-' || distribution
|| '-' || cpu_cost
|| '-' || io_cost
|| '-' || temp_space
|| '-' || hashkey(access_predicates)
|| '-' || hashkey(filter_predicates)
) as hash_path_row
from
(select * from my_sql_plan
)
)
group by
hash_value
--, plan_hash_value
, child_number;


Note that we have to restrict the final hash generation to the first 4,000 bytes, since the built-in hash functions in 9i don't support LOBs or collections. This means that without a user-defined hash function that supports LOBs or collections in 9i we can't differentiate between plans that have the same first 4,000 characters after aggregation. Since this is very unlikely it should matter only in exceptional cases. Furthermore the V$SQL_PLAN view in 9i doesn't cover the PLAN_HASH_VALUE function, therefore you would need to get that e.g. from V$SQL in case you want to show that.

Obviously the 9i V$SQL_PLAN also misses some of the columns that have been added in 10g (OBJECT_ALIAS, OBJECT_TYPE, PROJECTION, TIME, QBLOCK_NAME, REMARKS etc.).

2. Using an ordered collection and a custom TO_STRING function for string aggregation and the DBMS_UTILITY.GET_HASH_VALUE hash function


create or replace type ntt_varchar2 as table of varchar2(4000);
/

create or replace function to_string (
nt_in in ntt_varchar2,
delimiter_in in varchar2 default ','
) return clob is
v_idx pls_integer;
v_str clob;
v_dlm varchar2(10);
begin
v_idx := nt_in.first;
while v_idx is not null loop
v_str := v_str || v_dlm || nt_in(v_idx);
v_dlm := delimiter_in;
v_idx := nt_in.next(v_idx);
end loop;

return v_str;
end to_string;
/

with a as (
select
hash_value
--, plan_hash_value
, child_number
, dbms_utility.get_hash_value(
operation
|| '-' || options
|| '-' || object_node
|| '-' || object#
|| '-' || object_owner
|| '-' || object_name
|| '-' || optimizer
|| '-' || id
|| '-' || parent_id
|| '-' || depth
|| '-' || position
|| '-' || search_columns
|| '-' || cost
|| '-' || cardinality
|| '-' || bytes
|| '-' || other_tag
|| '-' || partition_start
|| '-' || partition_stop
|| '-' || partition_id
|| '-' || dbms_utility.get_hash_value(other, 0, 2147483647)
|| '-' || distribution
|| '-' || cpu_cost
|| '-' || io_cost
|| '-' || temp_space
|| '-' || dbms_utility.get_hash_value(access_predicates, 0, 2147483647)
|| '-' || dbms_utility.get_hash_value(filter_predicates, 0, 2147483647)
, 0, 2147483647) as hash_path_row
from
(select * from my_sql_plan
)
)
select
hash_value
--, plan_hash_value
, child_number
, dbms_utility.get_hash_value(
substr(to_string(cast(multiset(
select
hash_path_row
from
a
where
a.hash_value = b.hash_value
--and a.plan_hash_value = b.plan_hash_value
and a.child_number = b.child_number
order by
1) as ntt_varchar2), '-'), 1, 4000), 0, 2147483647) as the_hash
from (
select distinct
hash_value
--, plan_hash_value
, child_number
from
a
) b;


Again this needs to be limited to the first 4,000 bytes, and the PLAN_HASH_VALUE in 9i is missing. Furthermore the COLLECT aggregate function is not supported in 9i, so we need to generate the collections using the MULTISET subquery method.

Finally here's an example what you can do with these new hash values:


select
hash_value
, plan_hash_value
, child_number
, the_hash
, case when plan_hash_value = next_plan_hash_value and the_hash != next_the_hash then 'DIFF!' end as are_hashs_diff
from (
select
hash_value
, plan_hash_value
, child_number
, the_hash
, lead(plan_hash_value, 1) over (partition by hash_value, plan_hash_value order by child_number) as next_plan_hash_value
, lead(the_hash, 1) over (partition by hash_value, plan_hash_value order by child_number) as next_the_hash
from (
select
hash_value
, plan_hash_value
, child_number
, ora_hash(cast(collect(to_char(hash_path_row, 'TM')) as ntt_varchar2)) as the_hash
from (
select
hash_value
, plan_hash_value
, child_number
, hash_path_row
from (
select
hash_value
, plan_hash_value
, child_number
, id
, dense_rank() over (order by hash_value, plan_hash_value, child_number) as rnk
, ora_hash(
operation
|| '-' || options
|| '-' || object_node
|| '-' || object#
|| '-' || object_owner
|| '-' || object_name
|| '-' || object_alias
|| '-' || object_type
|| '-' || optimizer
|| '-' || id
|| '-' || parent_id
|| '-' || depth
|| '-' || position
|| '-' || search_columns
|| '-' || cost
|| '-' || cardinality
|| '-' || bytes
|| '-' || other_tag
|| '-' || partition_start
|| '-' || partition_stop
|| '-' || partition_id
|| '-' || ora_hash(other)
|| '-' || distribution
|| '-' || cpu_cost
|| '-' || io_cost
|| '-' || temp_space
|| '-' || ora_hash(access_predicates)
|| '-' || ora_hash(filter_predicates)
|| '-' || ora_hash(projection)
|| '-' || time
|| '-' || qblock_name
|| '-' || ora_hash(remarks)
) as hash_path_row
from (
select
*
from
my_sql_plan
)
)
)
group by
hash_value
, plan_hash_value
, child_number
)
);


This will mark all child cursors for the same statement that have the same PLAN_HASH_VALUE but according to your new hash value are different. The results can be quite surprising.

Wednesday, July 22, 2009

PLAN_HASH_VALUE - How equal (and stable?) are your execution plans - part 1

Oracle provides in recent releases the PLAN_HASH_VALUE information, which according to the documentation, is the following:

"Numerical representation of the SQL plan for the cursor. Comparing one PLAN_HASH_VALUE to another easily identifies whether or not two plans are the same (rather than comparing the two plans line by line)."

So according to the documentation the PLAN_HASH_VALUE can be used as a shortcut to quickly and easily determine if two execution plans are the "same".

I think that the statement "the two execution plans are the same" suggests that execution plans having the same PLAN_HASH_VALUE yield the same or at least similar performance at runtime. An interesting point that I would like to cover here in the following test cases.

This raises the interesting question, what exactly is the PLAN_HASH_VALUE based upon? Obviously it is a hash value calculated using the execution plan information as input. Apart from the fact that hash values are in theory always subject to potential hash collisions, which means that two different inputs can lead to the same hash value, the more interesting question is, which attributes of the execution plan are used as input?

Having this information at hand allows us to get a better understanding if two execution plans with the same PLAN_HASH_VALUE actually have to a yield similar execution profile (which Oracle doesn't say but I assume is a common assumption - or may be misconception?).

Let's start with a simple example to determine what makes the PLAN_HASH_VALUE different. All results shown below come from an 11.1.0.7 Win32 instance with a 8KB default block size, a MSSM 8KB LMT tablespace and default system statistics.


SQL>
SQL> drop table plan_hash_value_test1 purge;

Table dropped.

SQL>
SQL> drop table plan_hash_value_test2 purge;

Table dropped.

SQL>
SQL> drop table plan_hash_value_test3 purge;

Table dropped.

SQL>
SQL> drop user test_user_plan_hash_value cascade;

User dropped.

SQL>
SQL> create user test_user_plan_hash_value
2 identified by test_user_plan_hash_value
3 default tablespace test_8k
4 quota unlimited on test_8k;

User created.

SQL>
SQL> create table plan_hash_value_test1
2 as
3 select
4 id as id1
5 , mod(id, 2) as id2
6 , rpad('x', 20) as small_vc
7 from
8 (
9 select
10 level as id
11 from
12 dual
13 connect by
14 level <= 1000
15 );

Table created.

SQL>
SQL> exec dbms_stats.gather_table_stats(null, 'plan_hash_value_test1')

PL/SQL procedure successfully completed.

SQL>
SQL> create index plan_hash_value_test1_idx1 on plan_hash_value_test1 (id1, id2);

Index created.

SQL>
SQL> alter system flush shared_pool;

System altered.

SQL>
SQL> alter system flush shared_pool;

System altered.

SQL>
SQL> select /*+ find_me
2 index(a, plan_hash_value_test1_idx1) */
3 *
4 from
5 plan_hash_value_test1 a
6 where
7 id1 = 1
8 and rownum <= 1;

ID1 ID2 SMALL_VC
---------- ---------- --------------------
1 1 x

SQL>
SQL> select * from table(dbms_xplan.display_cursor);

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------------------------------
SQL_ID 3u4sfg1kzqtct, child number 0
-------------------------------------
select /*+ find_me index(a, plan_hash_value_test1_idx1) */
* from plan_hash_value_test1 a where id1 = 1
and rownum <= 1

Plan hash value: 2655295642

-----------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | | | 3 (100)| |
|* 1 | COUNT STOPKEY | | | | | |
| 2 | TABLE ACCESS BY INDEX ROWID| PLAN_HASH_VALUE_TEST1 | 1 | 27 | 3 (0)| 00:00:01 |
|* 3 | INDEX RANGE SCAN | PLAN_HASH_VALUE_TEST1_IDX1 | 1 | | 2 (0)| 00:00:01 |
-----------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - filter(ROWNUM<=1)
3 - access("ID1"=1)


23 rows selected.

SQL>
SQL> select
2 object#, object_owner, object_name
3 from
4 v$sql_plan
5 where
6 operation = 'INDEX'
7 and sql_id = (select
8 sql_id
9 from
10 v$sql
11 where
12 sql_text like 'select /*+ find_me%'
13 );

OBJECT# OBJECT_OWNER OBJECT_NAME
---------- ------------------------------ ------------------------------
76267 CBO_TEST PLAN_HASH_VALUE_TEST1_IDX1

SQL>
SQL> pause

SQL>
SQL> alter system flush shared_pool;

System altered.

SQL>
SQL> alter system flush shared_pool;

System altered.

SQL>
SQL> drop index plan_hash_value_test1_idx1;

Index dropped.

SQL>
SQL> create index test_user_plan_hash_value.plan_hash_value_test1_idx1 on plan_hash_value_test1 (id2, id1);

Index created.

SQL>
SQL> exec dbms_stats.set_index_stats('test_user_plan_hash_value', 'plan_hash_value_test1_idx1', numdist=>0, numrows=>0)

PL/SQL procedure successfully completed.

SQL>
SQL> select /*+ find_me
2 index(a, plan_hash_value_test1_idx1) */
3 *
4 from
5 plan_hash_value_test1 a
6 where
7 id2 = 1
8 and rownum <= 2;

ID1 ID2 SMALL_VC
---------- ---------- --------------------
1 1 x
3 1 x

SQL>
SQL> select * from table(dbms_xplan.display_cursor);

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------------------------------
SQL_ID dsjuqc6fkfffj, child number 0
-------------------------------------
select /*+ find_me index(a, plan_hash_value_test1_idx1) */
* from plan_hash_value_test1 a where id2 = 1
and rownum <= 2

Plan hash value: 2655295642

-----------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | | | 3 (100)| |
|* 1 | COUNT STOPKEY | | | | | |
| 2 | TABLE ACCESS BY INDEX ROWID| PLAN_HASH_VALUE_TEST1 | 2 | 54 | 3 (0)| 00:00:01 |
|* 3 | INDEX RANGE SCAN | PLAN_HASH_VALUE_TEST1_IDX1 | 1 | | 2 (0)| 00:00:01 |
-----------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - filter(ROWNUM<=2)
3 - access("ID2"=1)


23 rows selected.

SQL>
SQL> select
2 object#, object_owner, object_name
3 from
4 v$sql_plan
5 where
6 operation = 'INDEX'
7 and sql_id = (select
8 sql_id
9 from
10 v$sql
11 where
12 sql_text like 'select /*+ find_me%'
13 );

OBJECT# OBJECT_OWNER OBJECT_NAME
---------- ------------------------------ ------------------------------
76268 TEST_USER_PLAN_HASH_VALUE PLAN_HASH_VALUE_TEST1_IDX1

SQL>
SQL> pause

SQL>
SQL> drop index test_user_plan_hash_value.plan_hash_value_test1_idx1;

Index dropped.

SQL>
SQL> create index plan_hash_value_test1_idx1 on plan_hash_value_test1 (id2, id1);

Index created.

SQL>
SQL> select /*+ find_me
2 index(a, plan_hash_value_test1_idx1) */
3 *
4 from
5 plan_hash_value_test1 a
6 where
7 id2 = 1
8 and rownum <= 2;

ID1 ID2 SMALL_VC
---------- ---------- --------------------
1 1 x
3 1 x

SQL>
SQL> select * from table(dbms_xplan.display_cursor);

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------------------------------
SQL_ID dsjuqc6fkfffj, child number 0
-------------------------------------
select /*+ find_me index(a, plan_hash_value_test1_idx1) */
* from plan_hash_value_test1 a where id2 = 1
and rownum <= 2

Plan hash value: 2655295642

-----------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | | | 3 (100)| |
|* 1 | COUNT STOPKEY | | | | | |
| 2 | TABLE ACCESS BY INDEX ROWID| PLAN_HASH_VALUE_TEST1 | 2 | 54 | 3 (0)| 00:00:01 |
|* 3 | INDEX RANGE SCAN | PLAN_HASH_VALUE_TEST1_IDX1 | 500 | | 2 (0)| 00:00:01 |
-----------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - filter(ROWNUM<=2)
3 - access("ID2"=1)


23 rows selected.

SQL>
SQL> select
2 object#, object_owner, object_name
3 from
4 v$sql_plan
5 where
6 operation = 'INDEX'
7 and sql_id = (select
8 sql_id
9 from
10 v$sql
11 where
12 sql_text like 'select /*+ find_me%'
13 );

OBJECT# OBJECT_OWNER OBJECT_NAME
---------- ------------------------------ ------------------------------
76269 CBO_TEST PLAN_HASH_VALUE_TEST1_IDX1

SQL>


Note the subtle differences: We have an index on a simple table, but these are actually three different objects, once owned by a different user and once with a different definition, but all three statements get the same PLAN_HASH_VALUE.

So this simple example already raises some of the most important points:

- The OBJECT_ID of an object obviously doesn't get used for the PLAN_HASH_VALUE calculation. Although present in the V$SQL_PLAN view, checking the PLAN_TABLE definition suggests why: It's not part of the PLAN_TABLE definition. This means that e.g. replacing an index with a different one but keeping the name will result in the same PLAN_HASH_VALUE if the operations and their order remain the same, like demonstrated (Although I have to admit that doing so might be called bad practice).

- The owner of an object doesn't get used either. Why this is so, remains unclear, since it seems to be part of all related object definitions (OBJECT_OWNER attribute in PLAN_TABLE/V$SQL_PLAN), but might be an very important point if you have multiple schemas with objects of the same name but representing different data or data volume. As you can see from the example the PLAN_HASH_VALUE is the same but the number of rows estimated is different due to the different underlying object (and its intentionally manipulated statistics).

- One of the most crucial aspects is also demonstrated: The FILTER_PREDICATES and ACCESS_PREDICATES information is NOT part of the PLAN_HASH_VALUE. This means that two executions plans can have the same PLAN_HASH_VALUE but behave significantly differently at actual execution time.

- What also is obvious from this example is that the ROWS and BYTES information is not used as part of the PLAN_HASH_VALUE. More on this later.

Let's move on to the next example:


SQL>
SQL> variable b2 number
SQL>
SQL> exec :b2 := 1;

PL/SQL procedure successfully completed.

SQL>
SQL> select /*+ find_me
2 index(a, plan_hash_value_test1_idx1) */
3 *
4 from
5 plan_hash_value_test1 a
6 where
7 id2 = :b2
8 and rownum <= 1;

ID1 ID2 SMALL_VC
---------- ---------- --------------------
1 1 x

SQL>
SQL> select * from table(dbms_xplan.display_cursor);

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------------------------------
SQL_ID 4y6b2pu1hzs3h, child number 0
-------------------------------------
select /*+ find_me index(a, plan_hash_value_test1_idx1) */
* from plan_hash_value_test1 a where id2 = :b2
and rownum <= 1

Plan hash value: 2655295642

-----------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | | | 3 (100)| |
|* 1 | COUNT STOPKEY | | | | | |
| 2 | TABLE ACCESS BY INDEX ROWID| PLAN_HASH_VALUE_TEST1 | 1 | 27 | 3 (0)| 00:00:01 |
|* 3 | INDEX RANGE SCAN | PLAN_HASH_VALUE_TEST1_IDX1 | 500 | | 2 (0)| 00:00:01 |
-----------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - filter(ROWNUM<=1)
3 - access("ID2"=:B2)


23 rows selected.

SQL>
SQL> variable b2 varchar2(20)
SQL>
SQL> exec :b2 := '1';

PL/SQL procedure successfully completed.

SQL>
SQL> select /*+ find_me
2 index(a, plan_hash_value_test1_idx1) */
3 *
4 from
5 plan_hash_value_test1 a
6 where
7 id2 = :b2
8 and rownum <= 1;

ID1 ID2 SMALL_VC
---------- ---------- --------------------
1 1 x

SQL>
SQL> select * from table(dbms_xplan.display_cursor);

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------------------------------
SQL_ID 4y6b2pu1hzs3h, child number 1
-------------------------------------
select /*+ find_me index(a, plan_hash_value_test1_idx1) */
* from plan_hash_value_test1 a where id2 = :b2
and rownum <= 1

Plan hash value: 2655295642

-----------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | | | 3 (100)| |
|* 1 | COUNT STOPKEY | | | | | |
| 2 | TABLE ACCESS BY INDEX ROWID| PLAN_HASH_VALUE_TEST1 | 1 | 27 | 3 (0)| 00:00:01 |
|* 3 | INDEX RANGE SCAN | PLAN_HASH_VALUE_TEST1_IDX1 | 500 | | 2 (0)| 00:00:01 |
-----------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - filter(ROWNUM<=1)
3 - access("ID2"=TO_NUMBER(:B2))


23 rows selected.

SQL>


Whereas the previous examples mainly used different SQLs (and therefore had different SQL_HASH_VALUEs or SQL_IDs) this example uses the same SQL and demonstrates the following:

- Bad application behaviour (in this case different types of bind variables) or other reasons can lead to unshared cursors, i.e. multiple child cursors for the same SQL (not actually the main topic here)

- But although the different cursors have different access predicates (in this case the implicit type conversion), as already shown, different access predicates don't lead to different PLAN_HASH_VALUEs

For this particular statement the difference in the predicates very likely doesn't represent a threat, but there are more subtle cases where these differences can lead to significant changes in behaviour. Possible reasons are different order of the predicate evaluation (e.g. when having system statistics enabled, which is the default from 10g on) which can make a significant difference in CPU usage or general resource consumption depending on the actual cost of the predicate evaluation (e.g. a costly PL/SQL function call), or even the evaluation of (filter) predicates at different steps of the same execution plan which could lead to significant differences in the number of rows generated by each operation step of the execution plan, and therefore make an execution plan much more inefficient due to the larger number of rows processed.

What about the actual estimates associated with the particular operations of an execution plan?


SQL>
SQL> explain plan for
2 select /*+ full(a) */
3 *
4 from
5 plan_hash_value_test1 a
6 where
7 id2 = 1;

Explained.

SQL>
SQL> select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 3758120161

-------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 500 | 13500 | 3 (0)| 00:00:01 |
|* 1 | TABLE ACCESS FULL| PLAN_HASH_VALUE_TEST1 | 500 | 13500 | 3 (0)| 00:00:01 |
-------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - filter("ID2"=1)

13 rows selected.

SQL>
SQL> exec dbms_stats.set_table_stats(null, 'plan_hash_value_test1', numrows=>100000, numblks=>10000)

PL/SQL procedure successfully completed.

SQL>
SQL> explain plan for
2 select /*+ full(a) */
3 *
4 from
5 plan_hash_value_test1 a
6 where
7 id2 = 1;

Explained.

SQL>
SQL> select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 3758120161

-------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 50000 | 1318K| 2733 (1)| 00:00:33 |
|* 1 | TABLE ACCESS FULL| PLAN_HASH_VALUE_TEST1 | 50000 | 1318K| 2733 (1)| 00:00:33 |
-------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - filter("ID2"=1)

13 rows selected.

SQL>


It's obvious that none of the ROWS, BYTES, COST nor derived information like TIME in the later releases are used to calculate the PLAN_HASH_VALUE. So again this makes clear that the same PLAN_HASH_VALUE of two statements doesn't say anything about the similarity of the runtime performance.

Another example that demonstrates this point with a slightly more complex plan:


SQL>
SQL> exec dbms_stats.gather_table_stats(null, 'plan_hash_value_test1')

PL/SQL procedure successfully completed.

SQL>
SQL> create table plan_hash_value_test2
2 as
3 select
4 id as id1
5 , mod(id, 2) as id2
6 , rpad('x', 20) as small_vc
7 from
8 (
9 select
10 level as id
11 from
12 dual
13 connect by
14 level <= 1000
15 );

Table created.

SQL>
SQL> exec dbms_stats.gather_table_stats(null, 'plan_hash_value_test2')

PL/SQL procedure successfully completed.

SQL>
SQL> create index plan_hash_value_test2_idx1 on plan_hash_value_test2 (id2, id1);

Index created.

SQL>
SQL> explain plan for
2 select /*+
3 use_nl(a, b)
4 index(b)
5 leading(a)
6 */
7 *
8 from
9 plan_hash_value_test1 a
10 , plan_hash_value_test2 b
11 where
12 a.id1 = b.id1
13 and a.id2 = b.id2
14 and a.id1 between 1 and 10;

Explained.

SQL>
SQL> select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 458854847

-----------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 9 | 486 | 23 (0)| 00:00:01 |
| 1 | NESTED LOOPS | | | | | |
| 2 | NESTED LOOPS | | 9 | 486 | 23 (0)| 00:00:01 |
|* 3 | TABLE ACCESS FULL | PLAN_HASH_VALUE_TEST1 | 10 | 270 | 3 (0)| 00:00:01 |
|* 4 | INDEX RANGE SCAN | PLAN_HASH_VALUE_TEST2_IDX1 | 1 | | 1 (0)| 00:00:01 |
| 5 | TABLE ACCESS BY INDEX ROWID| PLAN_HASH_VALUE_TEST2 | 1 | 27 | 2 (0)| 00:00:01 |
-----------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

3 - filter("A"."ID1"<=10 AND "A"."ID1">=1)
4 - access("A"."ID2"="B"."ID2" AND "A"."ID1"="B"."ID1")
filter("B"."ID1"<=10 AND "B"."ID1">=1)

19 rows selected.

SQL>
SQL> explain plan for
2 select /*+
3 use_nl(a, b)
4 index(b)
5 leading(a)
6 */
7 *
8 from
9 plan_hash_value_test1 a
10 , plan_hash_value_test2 b
11 where
12 a.id2 = b.id2
13 and a.id1 between 1 and 10;

Explained.

SQL>
SQL> select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 458854847

-----------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 5005 | 263K| 74 (2)| 00:00:01 |
| 1 | NESTED LOOPS | | | | | |
| 2 | NESTED LOOPS | | 5005 | 263K| 74 (2)| 00:00:01 |
|* 3 | TABLE ACCESS FULL | PLAN_HASH_VALUE_TEST1 | 10 | 270 | 3 (0)| 00:00:01 |
|* 4 | INDEX RANGE SCAN | PLAN_HASH_VALUE_TEST2_IDX1 | 500 | | 2 (0)| 00:00:01 |
| 5 | TABLE ACCESS BY INDEX ROWID| PLAN_HASH_VALUE_TEST2 | 500 | 13500 | 7 (0)| 00:00:01 |
-----------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

3 - filter("A"."ID1"<=10 AND "A"."ID1">=1)
4 - access("A"."ID2"="B"."ID2")

18 rows selected.

SQL>


Notice how one statement performs an effective join using appropriate join predicates whereas the other one generates duplicate records. Again, since the filter and access predicates are not evaluated, these two plans get the same PLAN_HASH_VALUE, although we can say that one of them is potentially suboptimal (and usually would result in a significantly different execution plan which in turn would have different PLAN_HASH_VALUEs but there are certainly more complex scenarios where the optimizer goes wrong for whatever reason resulting in such potentially inefficient execution plans).

What about extended execution plan information like partitioning and parallel execution?

Let's first address partitioning:


SQL> create table plan_hash_value_test3
2 (
3 invoice_no number,
4 sale_year integer not null,
5 sale_month integer not null,
6 sale_day integer not null
7 )
8 partition by range (invoice_no)
9 (
10 partition part_001 values less than (100),
11 partition part_002 values less than (400),
12 partition part_003 values less than (800),
13 partition part_004 values less than (maxvalue)
14 );

Table created.

SQL>
SQL> exec dbms_random.seed(0)

PL/SQL procedure successfully completed.

SQL>
SQL> insert into plan_hash_value_test3 (
2 invoice_no,
3 sale_year,
4 sale_month,
5 sale_day
6 )
7 select rownum,
8 2000 + round(dbms_random.value(0, 8)) as sale_year,
9 trunc(dbms_random.value(1, 13)) as sale_month,
10 trunc(dbms_random.value(1, 29)) as sale_day
11 from dual
12 connect by level <= 1000;

1000 rows created.

SQL>
SQL> commit;

Commit complete.

SQL>
SQL> exec dbms_stats.gather_table_stats(null, 'plan_hash_value_test3')

PL/SQL procedure successfully completed.

SQL>
SQL> exec dbms_stats.set_table_stats(null, 'plan_hash_value_test3', partname=>'part_002', numblks=>100000)

PL/SQL procedure successfully completed.

SQL>
SQL> select
2 partition_name
3 , blocks
4 from
5 user_tab_statistics
6 where
7 table_name = 'PLAN_HASH_VALUE_TEST3';

PARTITION_NAME BLOCKS
------------------------------ ----------
5
PART_001 1
PART_002 100000
PART_003 2
PART_004 1

SQL>
SQL> explain plan for
2 select
3 *
4 from
5 plan_hash_value_test3
6 where
7 invoice_no < 100;

Explained.

SQL>
SQL> select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 4079248530

----------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop |
----------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 99 | 1188 | 2 (0)| 00:00:01 | | |
| 1 | PARTITION RANGE SINGLE| | 99 | 1188 | 2 (0)| 00:00:01 | 1 | 1 |
| 2 | TABLE ACCESS FULL | PLAN_HASH_VALUE_TEST3 | 99 | 1188 | 2 (0)| 00:00:01 | 1 | 1 |
----------------------------------------------------------------------------------------------------------------

9 rows selected.

SQL>
SQL> explain plan for
2 select
3 *
4 from
5 plan_hash_value_test3
6 where
7 invoice_no >= 100 and invoice_no < 400;

Explained.

SQL>
SQL> select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 4079248530

----------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop |
----------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 300 | 3900 | 27256 (1)| 00:05:28 | | |
| 1 | PARTITION RANGE SINGLE| | 300 | 3900 | 27256 (1)| 00:05:28 | 2 | 2 |
| 2 | TABLE ACCESS FULL | PLAN_HASH_VALUE_TEST3 | 300 | 3900 | 27256 (1)| 00:05:28 | 2 | 2 |
----------------------------------------------------------------------------------------------------------------

9 rows selected.

SQL>
SQL> variable b1 number
SQL> variable b2 number
SQL>
SQL> exec :b1 := 0;

PL/SQL procedure successfully completed.

SQL> exec :b2 := 100;

PL/SQL procedure successfully completed.

SQL>
SQL> select
2 count(*)
3 from
4 plan_hash_value_test3
5 where
6 invoice_no >= :b1 and invoice_no < :b2;

COUNT(*)
----------
99

SQL>
SQL> select * from table(dbms_xplan.display_cursor);

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------------------------------
SQL_ID 23m1jk6dbsvbx, child number 0
-------------------------------------
select count(*) from plan_hash_value_test3 where
invoice_no >= :b1 and invoice_no < :b2

Plan hash value: 2447373661

--------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop |
--------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | | | 2 (100)| | | |
| 1 | SORT AGGREGATE | | 1 | 3 | | | | |
|* 2 | FILTER | | | | | | | |
| 3 | PARTITION RANGE ITERATOR| | 99 | 297 | 2 (0)| 00:00:01 | KEY | KEY |
|* 4 | TABLE ACCESS FULL | PLAN_HASH_VALUE_TEST3 | 99 | 297 | 2 (0)| 00:00:01 | KEY | KEY |
--------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

2 - filter(:B1<:B2)
4 - filter(("INVOICE_NO">=:B1 AND "INVOICE_NO"<:B2))


23 rows selected.

SQL>
SQL> alter system flush shared_pool;

System altered.

SQL>
SQL> alter system flush shared_pool;

System altered.

SQL>
SQL> exec :b1 := 100;

PL/SQL procedure successfully completed.

SQL> exec :b2 := 400;

PL/SQL procedure successfully completed.

SQL>
SQL> select
2 count(*)
3 from
4 plan_hash_value_test3
5 where
6 invoice_no >= :b1 and invoice_no < :b2;

COUNT(*)
----------
300

SQL>
SQL> select * from table(dbms_xplan.display_cursor);

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------------------------------
SQL_ID 23m1jk6dbsvbx, child number 0
-------------------------------------
select count(*) from plan_hash_value_test3 where
invoice_no >= :b1 and invoice_no < :b2

Plan hash value: 2447373661

--------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop |
--------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | | | 27256 (100)| | | |
| 1 | SORT AGGREGATE | | 1 | 4 | | | | |
|* 2 | FILTER | | | | | | | |
| 3 | PARTITION RANGE ITERATOR| | 300 | 1200 | 27256 (1)| 00:05:28 | KEY | KEY |
|* 4 | TABLE ACCESS FULL | PLAN_HASH_VALUE_TEST3 | 300 | 1200 | 27256 (1)| 00:05:28 | KEY | KEY |
--------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

2 - filter(:B1<:B2)
4 - filter(("INVOICE_NO">=:B1 AND "INVOICE_NO"<:B2))


23 rows selected.

SQL>


The PSTART and PSTOP information is not used either for the PLAN_HASH_VALUE, so different execution plans accessing different partitions of the same object might get the same PLAN_HASH_VALUE, but again the runtime performance might be dramatically different.

By the way above example demonstrates that the CBO peeks at the binds and uses then the partition level statistics of the corresponding partition defined by the bind values (in case the bind values prune to a single partition), although the execution plan might be executed with different bind values actually accessing different partitions at runtime.

What about parallel execution?


SQL>
SQL> explain plan for
2 select /*+
3 full(a)
4 full(b)
5 use_hash(a, b)
6 parallel(a)
7 parallel(b)
8 pq_distribute(b, none, broadcast)
9 */
10 *
11 from
12 plan_hash_value_test1 a
13 , plan_hash_value_test2 b
14 where
15 a.id1 = b.id1
16 and a.id2 = b.id2
17 and a.id1 between 1 and 10;

Explained.

SQL>
SQL> select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 1365899609

-------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | TQ |IN-OUT| PQ Distrib |
-------------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 9 | 486 | 5 (20)| 00:00:01 | | | |
| 1 | PX COORDINATOR | | | | | | | | |
| 2 | PX SEND QC (RANDOM) | :TQ10001 | 9 | 486 | 5 (20)| 00:00:01 | Q1,01 | P->S | QC (RAND) |
|* 3 | HASH JOIN | | 9 | 486 | 5 (20)| 00:00:01 | Q1,01 | PCWP | |
| 4 | PX BLOCK ITERATOR | | 10 | 270 | 2 (0)| 00:00:01 | Q1,01 | PCWC | |
|* 5 | TABLE ACCESS FULL | PLAN_HASH_VALUE_TEST1 | 10 | 270 | 2 (0)| 00:00:01 | Q1,01 | PCWP | |
| 6 | BUFFER SORT | | | | | | Q1,01 | PCWC | |
| 7 | PX RECEIVE | | 10 | 270 | 2 (0)| 00:00:01 | Q1,01 | PCWP | |
| 8 | PX SEND BROADCAST | :TQ10000 | 10 | 270 | 2 (0)| 00:00:01 | Q1,00 | P->P | BROADCAST |
| 9 | PX BLOCK ITERATOR | | 10 | 270 | 2 (0)| 00:00:01 | Q1,00 | PCWC | |
|* 10 | TABLE ACCESS FULL| PLAN_HASH_VALUE_TEST2 | 10 | 270 | 2 (0)| 00:00:01 | Q1,00 | PCWP | |
-------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

3 - access("A"."ID1"="B"."ID1" AND "A"."ID2"="B"."ID2")
5 - filter("A"."ID1"<=10 AND "A"."ID1">=1)
10 - filter("B"."ID1"<=10 AND "B"."ID1">=1)

24 rows selected.

SQL>
SQL> explain plan for
2 select /*+
3 full(a)
4 full(b)
5 use_hash(a, b)
6 parallel(a)
7 parallel(b)
8 pq_distribute(b, broadcast, none)
9 */
10 *
11 from
12 plan_hash_value_test1 a
13 , plan_hash_value_test2 b
14 where
15 a.id1 = b.id1
16 and a.id2 = b.id2
17 and a.id1 between 1 and 10;

Explained.

SQL>
SQL> select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 3903716067

------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | TQ |IN-OUT| PQ Distrib |
------------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 9 | 486 | 5 (20)| 00:00:01 | | | |
| 1 | PX COORDINATOR | | | | | | | | |
| 2 | PX SEND QC (RANDOM) | :TQ10001 | 9 | 486 | 5 (20)| 00:00:01 | Q1,01 | P->S | QC (RAND) |
|* 3 | HASH JOIN | | 9 | 486 | 5 (20)| 00:00:01 | Q1,01 | PCWP | |
| 4 | PX RECEIVE | | 10 | 270 | 2 (0)| 00:00:01 | Q1,01 | PCWP | |
| 5 | PX SEND BROADCAST | :TQ10000 | 10 | 270 | 2 (0)| 00:00:01 | Q1,00 | P->P | BROADCAST |
| 6 | PX BLOCK ITERATOR | | 10 | 270 | 2 (0)| 00:00:01 | Q1,00 | PCWC | |
|* 7 | TABLE ACCESS FULL| PLAN_HASH_VALUE_TEST1 | 10 | 270 | 2 (0)| 00:00:01 | Q1,00 | PCWP | |
| 8 | PX BLOCK ITERATOR | | 10 | 270 | 2 (0)| 00:00:01 | Q1,01 | PCWC | |
|* 9 | TABLE ACCESS FULL | PLAN_HASH_VALUE_TEST2 | 10 | 270 | 2 (0)| 00:00:01 | Q1,01 | PCWP | |
------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

3 - access("A"."ID1"="B"."ID1" AND "A"."ID2"="B"."ID2")
7 - filter("A"."ID1"<=10 AND "A"."ID1">=1)
9 - filter("B"."ID1"<=10 AND "B"."ID1">=1)

23 rows selected.

SQL>
SQL> explain plan for
2 select /*+
3 full(a)
4 full(b)
5 use_hash(a, b)
6 parallel(a)
7 parallel(b)
8 pq_distribute(b, hash, hash)
9 */
10 *
11 from
12 plan_hash_value_test1 a
13 , plan_hash_value_test2 b
14 where
15 a.id1 = b.id1
16 and a.id2 = b.id2
17 and a.id1 between 1 and 10;

Explained.

SQL>
SQL> select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 904614956

------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | TQ |IN-OUT| PQ Distrib |
------------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 9 | 486 | 5 (20)| 00:00:01 | | | |
| 1 | PX COORDINATOR | | | | | | | | |
| 2 | PX SEND QC (RANDOM) | :TQ10002 | 9 | 486 | 5 (20)| 00:00:01 | Q1,02 | P->S | QC (RAND) |
|* 3 | HASH JOIN BUFFERED | | 9 | 486 | 5 (20)| 00:00:01 | Q1,02 | PCWP | |
| 4 | PX RECEIVE | | 10 | 270 | 2 (0)| 00:00:01 | Q1,02 | PCWP | |
| 5 | PX SEND HASH | :TQ10000 | 10 | 270 | 2 (0)| 00:00:01 | Q1,00 | P->P | HASH |
| 6 | PX BLOCK ITERATOR | | 10 | 270 | 2 (0)| 00:00:01 | Q1,00 | PCWC | |
|* 7 | TABLE ACCESS FULL| PLAN_HASH_VALUE_TEST1 | 10 | 270 | 2 (0)| 00:00:01 | Q1,00 | PCWP | |
| 8 | PX RECEIVE | | 10 | 270 | 2 (0)| 00:00:01 | Q1,02 | PCWP | |
| 9 | PX SEND HASH | :TQ10001 | 10 | 270 | 2 (0)| 00:00:01 | Q1,01 | P->P | HASH |
| 10 | PX BLOCK ITERATOR | | 10 | 270 | 2 (0)| 00:00:01 | Q1,01 | PCWC | |
|* 11 | TABLE ACCESS FULL| PLAN_HASH_VALUE_TEST2 | 10 | 270 | 2 (0)| 00:00:01 | Q1,01 | PCWP | |
------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

3 - access("A"."ID1"="B"."ID1" AND "A"."ID2"="B"."ID2")
7 - filter("A"."ID1"<=10 AND "A"."ID1">=1)
11 - filter("B"."ID1"<=10 AND "B"."ID1">=1)

25 rows selected.

SQL>


The example uses different parallel distribution options for the same execution plan. In 10g and later this is reflected in different operations (like PX SEND BROADCAST) and this suggests that the PLAN_HASH_VALUEs are going to be different due to the different operations.

Running a similar test case on 9.2.0.8 shows it seems that actually some of the attributes related to parallel execution are also used to calculate the PLAN_HASH_VALUE:


SQL>
SQL> alter system flush shared_pool;

System altered.

SQL>
SQL> alter system flush shared_pool;

System altered.

SQL>
SQL> explain plan for
2 select /*+ find_me
3 full(a)
4 full(b)
5 use_hash(a, b)
6 parallel(a)
7 parallel(b)
8 pq_distribute(b, none, broadcast)
9 */
10 *
11 from
12 plan_hash_value_test1 a
13 , plan_hash_value_test2 b
14 where
15 a.id1 = b.id1
16 and a.id2 = b.id2
17 and a.id1 between 1 and 10;

Explained.

SQL>
SQL> select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------------------------------

-----------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| TQ |IN-OUT| PQ Distrib |
-----------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 5 | 270 | 6 (34)| | | |
|* 1 | HASH JOIN | | 5 | 270 | 6 (34)| 55,01 | P->S | QC (RAND) |
|* 2 | TABLE ACCESS FULL | PLAN_HASH_VALUE_TEST1 | 10 | 270 | 3 (34)| 55,01 | PCWP | |
|* 3 | TABLE ACCESS FULL | PLAN_HASH_VALUE_TEST2 | 10 | 270 | 3 (34)| 55,00 | P->P | BROADCAST |
-----------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - access("A"."ID1"="B"."ID1" AND "A"."ID2"="B"."ID2")
2 - filter("A"."ID1"<=10 AND "A"."ID1">=1)
3 - filter("B"."ID1"<=10 AND "B"."ID1">=1)

16 rows selected.

SQL>
SQL> rollback;

Rollback complete.

SQL>
SQL> select /*+ find_me
2 full(a)
3 full(b)
4 use_hash(a, b)
5 parallel(a)
6 parallel(b)
7 pq_distribute(b, none, broadcast)
8 */
9 *
10 from
11 plan_hash_value_test1 a
12 , plan_hash_value_test2 b
13 where
14 a.id1 = b.id1
15 and a.id2 = b.id2
16 and a.id1 between 1 and 10;

ID1 ID2 SMALL_VC ID1 ID2 SMALL_VC
---------- ---------- -------------------- ---------- ---------- --------------------
1 1 x 1 1 x
2 0 x 2 0 x
3 1 x 3 1 x
4 0 x 4 0 x
5 1 x 5 1 x
6 0 x 6 0 x
7 1 x 7 1 x
8 0 x 8 0 x
9 1 x 9 1 x
10 0 x 10 0 x

10 rows selected.

SQL>
SQL> select
2 plan_hash_value
3 , hash_value
4 from
5 v$sql
6 where
7 sql_text like 'select /*+ find_me%';

PLAN_HASH_VALUE HASH_VALUE
--------------- ----------
1709875781 508205717

SQL>
SQL> alter system flush shared_pool;

System altered.

SQL>
SQL> alter system flush shared_pool;

System altered.

SQL>
SQL> explain plan for
2 select /*+ find_me
3 full(a)
4 full(b)
5 use_hash(a, b)
6 parallel(a)
7 parallel(b)
8 pq_distribute(b, broadcast, none)
9 */
10 *
11 from
12 plan_hash_value_test1 a
13 , plan_hash_value_test2 b
14 where
15 a.id1 = b.id1
16 and a.id2 = b.id2
17 and a.id1 between 1 and 10;

Explained.

SQL>
SQL> select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------------------------------

-----------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| TQ |IN-OUT| PQ Distrib |
-----------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 5 | 270 | 6 (34)| | | |
|* 1 | HASH JOIN | | 5 | 270 | 6 (34)| 57,01 | P->S | QC (RAND) |
|* 2 | TABLE ACCESS FULL | PLAN_HASH_VALUE_TEST1 | 10 | 270 | 3 (34)| 57,00 | P->P | BROADCAST |
|* 3 | TABLE ACCESS FULL | PLAN_HASH_VALUE_TEST2 | 10 | 270 | 3 (34)| 57,01 | PCWP | |
-----------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - access("A"."ID1"="B"."ID1" AND "A"."ID2"="B"."ID2")
2 - filter("A"."ID1"<=10 AND "A"."ID1">=1)
3 - filter("B"."ID1"<=10 AND "B"."ID1">=1)

16 rows selected.

SQL>
SQL> rollback;

Rollback complete.

SQL>
SQL> select /*+ find_me
2 full(a)
3 full(b)
4 use_hash(a, b)
5 parallel(a)
6 parallel(b)
7 pq_distribute(b, broadcast, none)
8 */
9 *
10 from
11 plan_hash_value_test1 a
12 , plan_hash_value_test2 b
13 where
14 a.id1 = b.id1
15 and a.id2 = b.id2
16 and a.id1 between 1 and 10;

ID1 ID2 SMALL_VC ID1 ID2 SMALL_VC
---------- ---------- -------------------- ---------- ---------- --------------------
1 1 x 1 1 x
2 0 x 2 0 x
3 1 x 3 1 x
4 0 x 4 0 x
5 1 x 5 1 x
6 0 x 6 0 x
7 1 x 7 1 x
8 0 x 8 0 x
9 1 x 9 1 x
10 0 x 10 0 x

10 rows selected.

SQL>
SQL> select
2 plan_hash_value
3 , hash_value
4 from
5 v$sql
6 where
7 sql_text like 'select /*+ find_me%';

PLAN_HASH_VALUE HASH_VALUE
--------------- ----------
441284116 3983849749

SQL>
SQL> alter system flush shared_pool;

System altered.

SQL>
SQL> alter system flush shared_pool;

System altered.

SQL>
SQL> explain plan for
2 select /*+ find_me
3 full(a)
4 full(b)
5 use_hash(a, b)
6 parallel(a)
7 parallel(b)
8 pq_distribute(b, hash, hash)
9 */
10 *
11 from
12 plan_hash_value_test1 a
13 , plan_hash_value_test2 b
14 where
15 a.id1 = b.id1
16 and a.id2 = b.id2
17 and a.id1 between 1 and 10;

Explained.

SQL>
SQL> select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------------------------------

-----------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| TQ |IN-OUT| PQ Distrib |
-----------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 5 | 270 | 6 (34)| | | |
|* 1 | HASH JOIN | | 5 | 270 | 6 (34)| 59,02 | P->S | QC (RAND) |
|* 2 | TABLE ACCESS FULL | PLAN_HASH_VALUE_TEST1 | 10 | 270 | 3 (34)| 59,00 | P->P | HASH |
|* 3 | TABLE ACCESS FULL | PLAN_HASH_VALUE_TEST2 | 10 | 270 | 3 (34)| 59,01 | P->P | HASH |
-----------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - access("A"."ID1"="B"."ID1" AND "A"."ID2"="B"."ID2")
2 - filter("A"."ID1"<=10 AND "A"."ID1">=1)
3 - filter("B"."ID1"<=10 AND "B"."ID1">=1)

16 rows selected.

SQL>
SQL> rollback;

Rollback complete.

SQL>
SQL> select /*+ find_me
2 full(a)
3 full(b)
4 use_hash(a, b)
5 parallel(a)
6 parallel(b)
7 pq_distribute(b, hash, hash)
8 */
9 *
10 from
11 plan_hash_value_test1 a
12 , plan_hash_value_test2 b
13 where
14 a.id1 = b.id1
15 and a.id2 = b.id2
16 and a.id1 between 1 and 10;

ID1 ID2 SMALL_VC ID1 ID2 SMALL_VC
---------- ---------- -------------------- ---------- ---------- --------------------
1 1 x 1 1 x
3 1 x 3 1 x
5 1 x 5 1 x
8 0 x 8 0 x
10 0 x 10 0 x
2 0 x 2 0 x
4 0 x 4 0 x
6 0 x 6 0 x
7 1 x 7 1 x
9 1 x 9 1 x

10 rows selected.

SQL>
SQL> select
2 plan_hash_value
3 , hash_value
4 from
5 v$sql
6 where
7 sql_text like 'select /*+ find_me%';

PLAN_HASH_VALUE HASH_VALUE
--------------- ----------
353704519 1797852549

SQL>


Although the main operations and their order of execution stays the same, the calculated PLAN_HASH_VALUE is different in all three cases.

So in summary the following conclusions can be made:

- The same PLAN_HASH_VALUE is merely an indicator that the same operations on objects of the same name are performed in the same order.

- It tells nothing about the similarity of the expected runtime performance of the execution plan, due to various reasons as demonstrated. The most significant information that is not covered by the PLAN_HASH_VALUE are the filter and access predicates, but there are other attributes, too, that are not part of the hash value calculation.

- Still it's very likely that for the same SQL statement and the same unchanged underlying data the same PLAN_HASH_VALUE indicates similar expected runtime performance, except for some particular cases where the execution plan itself already contains "conditional" execution paths which might be taken by the runtime engine and therefore lead to significantly different runtimes. Some of these examples have already been covered by Jonathan Lewis: Conditional Plan and Hierarchical Queries in some versions of Oracle.

In the second part I'll demonstrate a possible approach how to calculate your own PLAN_HASH_VALUE that covers some of the information omitted if you suspect that you might have encountered a scenario where the same PLAN_HASH_VALUE suggests similar runtime performance but there are significant differences in the execution plans that you would like to have covered by the hash value calculation to detect these plan changes easily.