Monday, July 23, 2007

Subpartitions and optimizer statistics

UPDATE: If you came here because you seem to have issues with execution plans after upgrading to 10.2.0.4, please read this note about a severe bug in the optimizer of the 10.2.0.4 patch set regarding subpartition pruning to a single subpartition.

Recently I came across the following issue when analyzing some long running SQL statements at a customer site:

Some of their tables are using range-list composite partitioning, and the list subpartitions represent feeds which are loaded into the database by a generic ETL process. Some of these feeds are fairly large (millions of rows) but some others can be small (like only 1000 rows). So the resulting subpartitions differ dramatically in size. Statistics on subpartition level are updated properly after each data load and represent the different sizes very well. So far, so good.
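A generic ETL process can keep the subpartition-level statistics current by refreshing only the freshly loaded subpartition after each load. A sketch of such a call, with made-up table and subpartition names:

```sql
-- Illustrative only: refresh statistics for a single, freshly
-- loaded subpartition (FEED_TABLE and P200707_FEED42 are made up).
begin
  dbms_stats.gather_table_stats(
    ownname     => user,
    tabname     => 'FEED_TABLE',
    partname    => 'P200707_FEED42',
    granularity => 'SUBPARTITION'
  );
end;
/
```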

In the course of the process, some of these subpartitions are then joined to some other big tables, again having millions of rows. The large table being joined has an index on the join column which is unique. The SQL being generated uses explicit subpartition pruning so that the optimizer knows at parse time exactly which single subpartition will be accessed by the statement.
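Explicit subpartition pruning here means that the generated SQL names the target subpartition directly in the from clause; a sketch with made-up names:

```sql
-- Illustrative only: the table and subpartition names are made up.
-- The subpartition clause tells the optimizer at parse time that
-- exactly one subpartition of FEED_TABLE will be accessed.
select f.data1, j.data2
  from feed_table subpartition (p200707_feed42) f,
       join_table j
 where f.data1 = j.data1;
```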

Now we observed the following weird behaviour when this join took place: regardless of the size of the subpartition being joined to the large table, the optimizer always chose a hash join with full table scans on both tables. Even when a very small subpartition was joined to the large table, running the statement in serial mode took up to 15 minutes due to the full table scan on the large table.

Now we started to check all the usual things that might go wrong and mislead the optimizer, but everything we checked turned out to be set reasonably, including all the table, index and column statistics involved.

Still, when looking at the execution plan, it was quite obvious what was driving the optimizer to that hash join: the estimated cardinality of the subpartition was terribly wrong and seemed to be taken from the partition-level statistics rather than from the appropriate subpartition statistics.

In addition, the 10g explain plan included an operation called "PARTITION COMBINED ITERATOR" with "KEY,KEY" as partition start and stop criteria, which was irritating since the explicit subpartition pruning should provide enough information to use exactly that subpartition as start and stop. The full table scan line below that "iterator" in the execution plan then again showed exactly the partition start and stop key you would expect in this case, but - as already mentioned - the cardinality was way off and correlated with the cardinality of the parent range partition.

So I decided to create a generic test case in order to open a Service Request with Oracle to sort this out. And here is the irritating result (tested on 10gR2, 10.2.0.2, AIX 5.3.0.4).

The first test case focuses on the estimated cardinality when performing explicit subpartition pruning on range-list partitioned tables:

SQL>
SQL> drop table partition_test;

Table dropped.

SQL>
SQL> create table partition_test (
2 x_pkey number not null,
3 x_slice varchar2(20) not null,
4 data1 number
5 )
6 partition by range (x_pkey)
7 subpartition by list (x_slice)
8 (
9 partition pkey_0 values less than (1)
10 (
11 subpartition pkey_0_xxx values (' xxx '),
12 subpartition pkey_0_001 values ('001')
13 ),
14 partition pkey_1 values less than (2)
15 (
16 subpartition pkey_1_xxx values (' xxx '),
17 subpartition pkey_1_001 values ('001'),
18 subpartition pkey_1_101 values ('101')
19 )
20 );

Table created.

OK, now we have a range-list partitioned table with two range partitions, each having several list subpartitions. So let's put some data into it; the important point here is that the row counts differ significantly.

SQL>
SQL> insert into partition_test (x_pkey, x_slice, data1)
2 select 1, '001', seq
3 from (
4 select level as seq from dual connect by level <= 10
5 );

10 rows created.

SQL>
SQL> insert into partition_test (x_pkey, x_slice, data1)
2 select 1, '101', seq
3 from (
4 select level as seq from dual connect by level <= 1000
5 );

1000 rows created.

SQL>
SQL> begin dbms_stats.gather_table_stats(ownname=>USER,tabname=>'PARTITION_TEST',granularity=>'ALL'); end;
2 /

PL/SQL procedure successfully completed.

SQL>
SQL> select table_name, partition_name, subpartition_name, num_rows from user_tab_statistics where table_name = 'PARTITION_TEST';

TABLE_NAME                     PARTITION_NAME                 SUBPARTITION_NAME                NUM_ROWS
------------------------------ ------------------------------ ------------------------------ ----------
PARTITION_TEST                                                                                     1010
PARTITION_TEST                 PKEY_0                                                                 0
PARTITION_TEST                 PKEY_1                                                             1010
PARTITION_TEST                 PKEY_0                         PKEY_0_XXX                             0
PARTITION_TEST                 PKEY_0                         PKEY_0_001                             0
PARTITION_TEST                 PKEY_1                         PKEY_1_XXX                             0
PARTITION_TEST                 PKEY_1                         PKEY_1_001                            10
PARTITION_TEST                 PKEY_1                         PKEY_1_101                          1000


8 rows selected.


OK, the data is in, and we have proper statistics generated. Now let's check some simple cardinality estimates when performing explicit partition pruning. (Implicit pruning using where clauses doesn't really matter here; the additional filter predicates used to calculate the cardinality might lead to smaller estimates, but the main issue remains the same.)
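For comparison, the implicit variant of the pruning on the test table created above would look like this (a sketch; as noted, the estimated cardinality may additionally be scaled down by the filter predicates):

```sql
-- Implicit pruning: literals on both partition keys allow the
-- optimizer to prune to subpartition PKEY_1_001 at parse time,
-- but the predicates also act as filters in the cardinality
-- calculation, which is why the estimates can come out smaller.
explain plan for
select data1
  from partition_test
 where x_pkey = 1
   and x_slice = '001';

select * from table(dbms_xplan.display);
```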

SQL>
SQL> explain plan for select data1 from partition_test subpartition (pkey_1_001);

Explained.

SQL>
SQL> select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 315551227

--------------------------------------------------------------------------------------------------------------
| Id  | Operation                   | Name           | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
--------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |                |  1010 |  4040 |     4   (0)| 00:00:01 |       |       |
|   1 |  PARTITION COMBINED ITERATOR|                |  1010 |  4040 |     4   (0)| 00:00:01 |   KEY |   KEY |
|   2 |   TABLE ACCESS FULL         | PARTITION_TEST |  1010 |  4040 |     4   (0)| 00:00:01 |     4 |     4 |
--------------------------------------------------------------------------------------------------------------

9 rows selected.

SQL>
SQL> select /*+ gather_plan_statistics */ data1 from partition_test subpartition (pkey_1_001);

     DATA1
----------
         1
         2
         3
         4
         5
         6
         7
         8
         9
        10


10 rows selected.


SQL>
SQL> select * from table(dbms_xplan.display_cursor(NULL, 0, 'ALLSTATS LAST'));

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------------------------------
SQL_ID 8r0m2ku4f5vaw, child number 0
-------------------------------------
select /*+ gather_plan_statistics */ data1 from partition_test subpartition (pkey_1_001)

Plan hash value: 315551227

--------------------------------------------------------------------------------------------------------
| Id  | Operation                   | Name           | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
--------------------------------------------------------------------------------------------------------
|   1 |  PARTITION COMBINED ITERATOR|                |      1 |   1010 |     10 |00:00:00.01 |       8 |
|   2 |   TABLE ACCESS FULL         | PARTITION_TEST |      1 |   1010 |     10 |00:00:00.01 |       8 |
--------------------------------------------------------------------------------------------------------


13 rows selected.

SQL>

Wow, that's weird. What the heck is the optimizer doing here? Why does it refuse to use the obviously available subpartition statistics? Well, it would probably take a 10053 optimizer trace to get a clue, but I haven't done that yet. By the way, the plans are not that detailed in 9iR2 (9.2.0.2 and 9.2.0.8 tested), but the results are the same. So does this look like a feature? Works as designed?
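For reference, such a 10053 trace could be produced like this (standard technique, not something I ran for this test case):

```sql
-- Produce an optimizer trace for the next hard parse of the
-- statement; the trace file is written to user_dump_dest.
alter session set events '10053 trace name context forever, level 1';

explain plan for
select data1 from partition_test subpartition (pkey_1_001);

alter session set events '10053 trace name context off';
```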

Here's a second test case showing the impact of the issue. This time we create a similar table (range-list composite partitioning) and, in addition, a second table which represents our large join table mentioned above. Instead of actually loading a large amount of data, I'll fake the optimizer statistics so that the tables appear very large to the optimizer:


SQL>
SQL> drop table partition_test2;

Table dropped.

SQL>
SQL> create table partition_test2 (
2 x_pkey number not null,
3 x_slice varchar2(20) not null,
4 data1 number not null
5 )
6 partition by range (x_pkey)
7 subpartition by list (x_slice)
8 (
9 partition pkey_0 values less than (1)
10 (
11 subpartition pkey_0_xxx values (' xxx '),
12 subpartition pkey_0_001 values ('001')
13 ),
14 partition pkey_1 values less than (2)
15 (
16 subpartition pkey_1_xxx values (' xxx '),
17 subpartition pkey_1_001 values ('001'),
18 subpartition pkey_1_101 values ('101')
19 )
20 );

Table created.

SQL>
SQL> insert into partition_test2 (x_pkey, x_slice, data1)
2 select 1, '001', seq
3 from (
4 select level as seq from dual connect by level <= 10
5 );

10 rows created.

SQL>
SQL> insert into partition_test2 (x_pkey, x_slice, data1)
2 select 1, '101', seq
3 from (
4 select level as seq from dual connect by level <= 1000
5 );

1000 rows created.

SQL>
SQL> begin dbms_stats.gather_table_stats(ownname=>USER,tabname=>'PARTITION_TEST2',granularity=>'ALL'); end;
2 /

PL/SQL procedure successfully completed.

SQL>
SQL> begin dbms_stats.set_table_stats(ownname=>USER,tabname=>'PARTITION_TEST2',partname=>'PKEY_1_101',numrows=>1000000,numblks=>100000); end;
2 /

PL/SQL procedure successfully completed.

SQL>
SQL> begin dbms_stats.set_table_stats(ownname=>USER,tabname=>'PARTITION_TEST2',partname=>'PKEY_1',numrows=>1000010,numblks=>100005); end;
2 /

PL/SQL procedure successfully completed.

SQL>
SQL> begin dbms_stats.set_table_stats(ownname=>USER,tabname=>'PARTITION_TEST2',numrows=>1000010,numblks=>100005); end;
2 /

PL/SQL procedure successfully completed.

SQL>
SQL> declare
2 col_stat_rec dbms_stats.StatRec;
3 col_vals dbms_stats.numarray := dbms_stats.numarray(1, 1000000);
4 begin
5 col_stat_rec.epc := 2;
6 col_vals(1) := 1;
7 col_vals(2) := 1000000;
8 dbms_stats.prepare_column_values(col_stat_rec, col_vals);
9 dbms_stats.set_column_stats(ownname=>USER,tabname=>'PARTITION_TEST2',partname=>'PKEY_1_101',colname=>'DATA1',distcnt=>1000000,density=>1/1000000, srec=>col_stat_rec);
10 dbms_stats.set_column_stats(ownname=>USER,tabname=>'PARTITION_TEST2',partname=>'PKEY_1',colname=>'DATA1',distcnt=>1000000,density=>1/1000000, srec=>col_stat_rec);
11 dbms_stats.set_column_stats(ownname=>USER,tabname=>'PARTITION_TEST2',colname=>'DATA1',distcnt=>1000000,density=>1/1000000, srec=>col_stat_rec);
12 end;
13 /

PL/SQL procedure successfully completed.

SQL>
SQL> column table_name format a20
SQL> column index_name format a20
SQL> column partition_name format a10
SQL> column subpartition_name format a15
SQL> column column_name format a15
SQL> column plan_table_output format a300
SQL>
SQL> select table_name, partition_name, subpartition_name, num_rows, blocks from user_tab_statistics where table_name = 'PARTITION_TEST2';

TABLE_NAME           PARTITION_ SUBPARTITION_NA   NUM_ROWS     BLOCKS
-------------------- ---------- --------------- ---------- ----------
PARTITION_TEST2                                    1000010     100005
PARTITION_TEST2      PKEY_0                              0          0
PARTITION_TEST2      PKEY_1                        1000010     100005
PARTITION_TEST2      PKEY_0     PKEY_0_XXX               0          0
PARTITION_TEST2      PKEY_0     PKEY_0_001               0          0
PARTITION_TEST2      PKEY_1     PKEY_1_XXX               0          0
PARTITION_TEST2      PKEY_1     PKEY_1_001              10          5
PARTITION_TEST2      PKEY_1     PKEY_1_101         1000000     100000


8 rows selected.


SQL>
SQL> select column_name, num_distinct, density, num_nulls from user_tab_col_statistics where table_name = 'PARTITION_TEST2';

COLUMN_NAME     NUM_DISTINCT    DENSITY  NUM_NULLS
--------------- ------------ ---------- ----------
X_PKEY                     1          1          0
X_SLICE                    2         .5          0
DATA1                1000000    .000001          0


SQL>
SQL> select partition_name, column_name, num_distinct, density, num_nulls from user_part_col_statistics where table_name = 'PARTITION_TEST2';

PARTITION_ COLUMN_NAME     NUM_DISTINCT    DENSITY  NUM_NULLS
---------- --------------- ------------ ---------- ----------
PKEY_0     X_PKEY                     0          0          0
PKEY_0     X_SLICE                    0          0          0
PKEY_0     DATA1                      0          0          0
PKEY_1     X_PKEY                     1          1          0
PKEY_1     X_SLICE                    2         .5          0
PKEY_1     DATA1                1000000    .000001          0


6 rows selected.

SQL>
SQL> select subpartition_name, column_name, num_distinct, density, num_nulls from user_subpart_col_statistics where table_name = 'PARTITION_TEST2';

SUBPARTITION_NA COLUMN_NAME     NUM_DISTINCT    DENSITY  NUM_NULLS
--------------- --------------- ------------ ---------- ----------
PKEY_1_101      X_PKEY                     1          1          0
PKEY_1_001      X_PKEY                     1          1          0
PKEY_1_XXX      X_PKEY                     0          0          0
PKEY_1_101      X_SLICE                    1          1          0
PKEY_1_001      X_SLICE                    1          1          0
PKEY_1_XXX      X_SLICE                    0          0          0
PKEY_1_101      DATA1                1000000    .000001          0
PKEY_1_001      DATA1                     10         .1          0
PKEY_1_XXX      DATA1                      0          0          0
PKEY_0_XXX      X_PKEY                     0          0          0
PKEY_0_001      X_PKEY                     0          0          0
PKEY_0_XXX      X_SLICE                    0          0          0
PKEY_0_001      X_SLICE                    0          0          0
PKEY_0_XXX      DATA1                      0          0          0
PKEY_0_001      DATA1                      0          0          0


15 rows selected.

SQL>
SQL> drop table join_table;

Table dropped.

SQL>
SQL> create table join_table (
2 data1 number not null,
3 data2 number
4 );

Table created.

SQL>
SQL> insert into join_table (data1, data2)
2 select seq, seq
3 from (
4 select level as seq from dual connect by level <= 1000
5 );

1000 rows created.

SQL>
SQL> create unique index join_table_pk on join_table (data1);

Index created.

SQL>
SQL> begin dbms_stats.gather_table_stats(ownname=>USER,tabname=>'JOIN_TABLE', cascade=>true); end;
2 /

PL/SQL procedure successfully completed.

SQL>
SQL> begin dbms_stats.set_table_stats(ownname=>USER,tabname=>'JOIN_TABLE',numrows=>10000000,numblks=>1000000); end;
2 /

PL/SQL procedure successfully completed.

SQL>
SQL> begin dbms_stats.set_index_stats(ownname=>USER,indname=>'JOIN_TABLE_PK',numrows=>10000000,numlblks=>100000,numdist=>10000000,clstfct=>1000000,indlevel=>3); end;
2 /

PL/SQL procedure successfully completed.

SQL>
SQL> declare
2 col_stat_rec dbms_stats.StatRec;
3 col_vals dbms_stats.numarray := dbms_stats.numarray(1, 10000000);
4 begin
5 col_stat_rec.epc := 2;
6 col_vals(1) := 1;
7 col_vals(2) := 10000000;
8 dbms_stats.prepare_column_values(col_stat_rec, col_vals);
9 dbms_stats.set_column_stats(ownname=>USER,tabname=>'JOIN_TABLE',colname=>'DATA1',distcnt=>10000000,density=>1/10000000, srec=>col_stat_rec);
10 end;
11 /

PL/SQL procedure successfully completed.

SQL>
SQL> select table_name, num_rows, blocks from user_tab_statistics where table_name = 'JOIN_TABLE';

TABLE_NAME             NUM_ROWS     BLOCKS
-------------------- ---------- ----------
JOIN_TABLE             10000000    1000000


SQL>
SQL> select column_name, num_distinct, density, num_nulls from user_tab_col_statistics where table_name = 'JOIN_TABLE';

COLUMN_NAME     NUM_DISTINCT    DENSITY  NUM_NULLS
--------------- ------------ ---------- ----------
DATA1               10000000   .0000001          0
DATA2                   1000       .001          0


SQL>
SQL> select table_name, index_name, num_rows, leaf_blocks, clustering_factor, distinct_keys, blevel
2 from user_ind_statistics where table_name = 'JOIN_TABLE';

TABLE_NAME           INDEX_NAME             NUM_ROWS LEAF_BLOCKS CLUSTERING_FACTOR DISTINCT_KEYS     BLEVEL
-------------------- -------------------- ---------- ----------- ----------------- ------------- ----------
JOIN_TABLE           JOIN_TABLE_PK          10000000      100000           1000000      10000000          3


SQL>

Now we have two tables:

The range-list partitioned table is supposed to have a subpartition holding 10 records and another one holding 1,000,000 records. The remaining ones are left empty.

The large join table is supposed to have 10,000,000 records and a unique index on the join column.

Here is the plan when joining the main range partition holding the 1,000,010 records (both subpartitions included) to the large table:

SQL>
SQL> explain plan for select a.data1, b.data2 from partition_test2 partition (pkey_1) a, join_table b where a.data1 = b.data1;

Explained.

SQL>
SQL> select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 1688927127

-------------------------------------------------------------------------------------------------------------------
| Id  | Operation               | Name            | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     | Pstart| Pstop |
-------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT        |                 |  1000K|    10M|       |   223K  (1)| 00:44:42 |       |       |
|*  1 |  HASH JOIN              |                 |  1000K|    10M|    15M|   223K  (1)| 00:44:42 |       |       |
|   2 |   PARTITION RANGE SINGLE|                 |  1000K|  3906K|       | 19401   (1)| 00:03:53 |     2 |     2 |
|   3 |    PARTITION LIST ALL   |                 |  1000K|  3906K|       | 19401   (1)| 00:03:53 |     1 |     3 |
|   4 |     TABLE ACCESS FULL   | PARTITION_TEST2 |  1000K|  3906K|       | 19401   (1)| 00:03:53 |     3 |     5 |
|   5 |   TABLE ACCESS FULL     | JOIN_TABLE      |    10M|    66M|       |   193K  (1)| 00:38:48 |       |       |
-------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("A"."DATA1"="B"."DATA1")


17 rows selected.

This seems to make sense: two large sets to join, so full table scans with a hash join look reasonable.

So let's now join the very small subpartition to this large table, as might happen in our real-world scenario:

SQL>
SQL> explain plan for select a.data1, b.data2 from partition_test2 subpartition (pkey_1_001) a, join_table b where a.data1 = b.data1;

Explained.

SQL>
SQL> select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 1183211070

------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name            | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     | Pstart| Pstop |
------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |                 |  1000K|    10M|       |   223K  (1)| 00:44:42 |       |       |
|*  1 |  HASH JOIN                   |                 |  1000K|    10M|    15M|   223K  (1)| 00:44:42 |       |       |
|   2 |   PARTITION COMBINED ITERATOR|                 |  1000K|  3906K|       | 19401   (1)| 00:03:53 |   KEY |   KEY |
|   3 |    TABLE ACCESS FULL         | PARTITION_TEST2 |  1000K|  3906K|       | 19401   (1)| 00:03:53 |     4 |     4 |
|   4 |   TABLE ACCESS FULL          | JOIN_TABLE      |    10M|    66M|       |   193K  (1)| 00:38:48 |       |       |
------------------------------------------------------------------------------------------------------------------------


Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("A"."DATA1"="B"."DATA1")


16 rows selected.

SQL>

Again this weird "PARTITION COMBINED ITERATOR" with "KEY,KEY", and again a single-partition access shown in the full table scan line. But yet again, the optimizer refuses to accept that this single subpartition - according to the statistics - holds just 10 rows rather than 1,000,000. Its conclusion therefore is that the cheapest operation is a hash join with full table scans, which really hurts for the big table.

Let's assume the optimizer did take the correct statistics: what would happen to the plan then? Here's a proposal:

SQL> explain plan for select /*+ cardinality(a, 10) */ a.data1, b.data2 from partition_test2 subpartition (pkey_1_001) a, join_table b where a.data1 = b.data1;

Explained.

SQL>
SQL> select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 2056129735

----------------------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name            | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
----------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |                 |    10 |   110 | 19423   (1)| 00:03:54 |       |       |
|   1 |  NESTED LOOPS                |                 |    10 |   110 | 19423   (1)| 00:03:54 |       |       |
|   2 |   PARTITION COMBINED ITERATOR|                 |    10 |    40 | 19393   (1)| 00:03:53 |   KEY |   KEY |
|   3 |    TABLE ACCESS FULL         | PARTITION_TEST2 |    10 |    40 | 19393   (1)| 00:03:53 |     4 |     4 |
|   4 |   TABLE ACCESS BY INDEX ROWID| JOIN_TABLE      |     1 |     7 |     3   (0)| 00:00:01 |       |       |
|*  5 |    INDEX UNIQUE SCAN         | JOIN_TABLE_PK   |     1 |       |     2   (0)| 00:00:01 |       |       |
----------------------------------------------------------------------------------------------------------------


Predicate Information (identified by operation id):
---------------------------------------------------

   5 - access("A"."DATA1"="B"."DATA1")


17 rows selected.

SQL>

This looks quite promising, and when a similar hint is applied to our real-world example mentioned in the beginning, the serial execution run time drops from 15 minutes to just a couple of seconds, which is a vast improvement.

The Service Request is still being worked on, so stay tuned for what Oracle Support has to say about this.

Update (August 7th, 2007): Oracle has accepted this as a bug (BUG5996801), and it has already been fixed for the upcoming version 11g. They are currently planning to include the fix in the next patch set release, 10.2.0.4, for Oracle 10gR2 as well.

Saturday, June 30, 2007

Partitioned External Tables

If you ever need to access multiple similar external files of differing sizes, and at the same time want to ensure that only those files which are actually required get accessed, here is an approach that could be interesting to you.

Before Oracle 8 introduced real partitioned tables, so-called "Partition Views" were the only option for table partitioning support. It's quite interesting that even the Oracle 10gR2 optimizer still supports this feature, which was deprecated a long time ago. You can see this in the execution plan shown below: it contains a line with a "UNION-ALL PARTITION" operation, which means the optimizer has noticed that this is not a usual UNION ALL operator but is meant to be used as a "Partition View" and will therefore be treated differently, i.e. special rules apply.

Using "Partition Views" you can add a kind of pseudo-partitioning capability to external tables. This might be useful if you have external files that you want to treat similarly within Oracle, but frequently need to access only some and not all of them.

I recently came across such a requirement where "external" feeds of very different sizes (from a few hundred records up to several million) but of the same layout should be loaded efficiently by a generic ETL process.

One potential solution I found is shown here. The sample script creates three external tables. It is important that all of them have the same number and types of columns; this is one of the requirements that have to be met in order to use the "Partition View" functionality. The interesting thing about external tables in this context is that the external files could even have different physical layouts, as long as the resulting Oracle column and datatype definitions match.

On top of each of these three external tables a view is created that introduces the "partition" criterion of the underlying table. Finally the "Partition View" is created, which unions all the partitions together using the "UNION ALL" operator. For the curious: the separate views on the external tables are not required; the "Partition View" can reference the external tables directly, which might lower the maintenance effort if you need to work with many partitions. Another interesting thing about "Partition Views" is that you can create partition schemes that are impossible to implement using "real" partitioning; e.g. you could specify overlapping ranges in the "where" clauses of the views, and the optimizer is then supposed to be able to identify which of the underlying tables it needs to visit in order to fulfill the request.

The following sample further shows that you can also simulate a kind of list partitioning. If you check the resulting execution plan carefully, you can see that the artificial filter expressions introduced by the optimizer ("NULL IS NOT NULL") efficiently prevent access to the tables they are attached to. If you create the external tables using the "logfile" option, this can be proven very easily, as Oracle appends to (or creates) the logfile each time it accesses an external table. In this example you'll notice that only a single logfile is going to be created, which proves that the filter expressions work as desired. It would be even nicer if the execution plan showed something like "1" or "0" rows as the cardinality of the omitted tables; that would make it more obvious.

Using this approach you can handle all the external tables through the single "Partition View". If you apply suitable filter clauses when accessing the view, the optimizer can efficiently prune the number of underlying tables visited.

So in summary, all the pros and cons of "Partition Views" apply to this solution. One of the cons is that you need to maintain the view definition whenever you want to modify the partition scheme, e.g. when adding, dropping or merging/splitting a partition. If you fail to keep the view definition in sync with the actual underlying tables, your view will return incomplete or wrong data.

Among the pros of the solution is that it enables a feature that unfortunately is not possible using real partitioning: you can mark each individual "partition" as "parallel" or "noparallel" as required. So small chunks can be accessed serially, whereas large external tables can be created using the "parallel" option.
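As a sketch, assuming the three external tables created by the script further below, the degree could be set individually per "partition" like this:

```sql
-- Keep the small feeds serial, mark only the large one for
-- parallel access (degree 4 chosen arbitrarily for illustration).
alter table ext_table_test1 noparallel;
alter table ext_table_test2 noparallel;
alter table ext_table_test3 parallel 4;
```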

Here comes the script illustrating the solution and the required steps. It has been tested using Oracle 10.2.0.3.0 Enterprise Edition on Windows XP (32-bit):

-- create any text as file "test.txt" in [your_test_directory_goes_here]

CREATE OR REPLACE DIRECTORY
DATA_DIR AS
'[your_test_directory_goes_here]';

drop table ext_table_test1 purge;

drop table ext_table_test2 purge;

drop table ext_table_test3 purge;

CREATE TABLE EXT_TABLE_TEST1
(
  TEXT  VARCHAR2(4000 BYTE)
)
ORGANIZATION EXTERNAL
  (  TYPE ORACLE_LOADER
     DEFAULT DIRECTORY DATA_DIR
     ACCESS PARAMETERS
       ( records delimited by newline
    nobadfile
    nodiscardfile
    logfile 'ext_table_test1.log'
    )
     LOCATION (DATA_DIR:'test.txt')
  )
REJECT LIMIT UNLIMITED
NOPARALLEL;

CREATE TABLE EXT_TABLE_TEST2
(
  TEXT  VARCHAR2(4000 BYTE)
)
ORGANIZATION EXTERNAL
  (  TYPE ORACLE_LOADER
     DEFAULT DIRECTORY DATA_DIR
     ACCESS PARAMETERS
       ( records delimited by newline
    nobadfile
    nodiscardfile
    logfile 'ext_table_test2.log'
    )
     LOCATION (DATA_DIR:'test.txt')
  )
REJECT LIMIT UNLIMITED
NOPARALLEL;

CREATE TABLE EXT_TABLE_TEST3
(
  TEXT  VARCHAR2(4000 BYTE)
)
ORGANIZATION EXTERNAL
  (  TYPE ORACLE_LOADER
     DEFAULT DIRECTORY DATA_DIR
     ACCESS PARAMETERS
       ( records delimited by newline
    nobadfile
    nodiscardfile
    logfile 'ext_table_test3.log'
    )
     LOCATION (DATA_DIR:'test.txt')
  )
REJECT LIMIT UNLIMITED
NOPARALLEL;

create or replace view v_ext_table_test1 as select * from ext_table_test1 where text = 'ABC';

create or replace view v_ext_table_test2 as select * from ext_table_test2 where text = 'DEF';

create or replace view v_ext_table_test3 as select * from ext_table_test3 where text = 'GHI';

create or replace view v_ext_table_all as
select * from v_ext_table_test1
union all
select * from v_ext_table_test2
union all
select * from v_ext_table_test3;

/* above can be merged into one single view
CREATE OR REPLACE VIEW v_ext_table_all (
  text
) AS
select "TEXT" from ext_table_test1 where text = 'ABC'
union all
select "TEXT" from ext_table_test2 where text = 'DEF'
union all
select "TEXT" from ext_table_test3 where text = 'GHI'
/
*/

select * from v_ext_table_all where text = 'DEF';

--ִTheִplan
--
--  Plan hash value: 3090276507
--
---------------------------------------------------------------------------------------------------
--| Id  | Operation                     | Name            | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------------
--|   0 | SELECT STATEMENT              |                 |    96 |   187K|    68   (2)| 00:00:01 |
--|   1 |  VIEW                         | V_EXT_TABLE_ALL |    96 |   187K|    68   (2)| 00:00:01 |
--|   2 |   UNION-ALL PARTITION         |                 |       |       |            |          |
--|*  3 |    FILTER                     |                 |       |       |            |          |
--|*  4 |     EXTERNAL TABLE ACCESS FULL| EXT_TABLE_TEST1 |    82 |   160K|    24   (0)| 00:00:01 |
--|*  5 |    EXTERNAL TABLE ACCESS FULL | EXT_TABLE_TEST2 |    82 |   160K|    24   (0)| 00:00:01 |
--|*  6 |    FILTER                     |                 |       |       |            |          |
--|*  7 |     EXTERNAL TABLE ACCESS FULL| EXT_TABLE_TEST3 |    82 |   160K|    24   (0)| 00:00:01 |
---------------------------------------------------------------------------------------------------
--
--Predicate Information (identified by operation id):
-----------------------------------------------------
--
--   3 - filter(NULL IS NOT NULL)
--   4 - filter("TEXT"='ABC')
--   5 - filter("TEXT"='DEF')
--   6 - filter(NULL IS NOT NULL)
--   7 - filter("TEXT"='GHI')

Wednesday, May 16, 2007

National character set and string literals

Until the arrival of Oracle 10gR2 there was a quite interesting limitation regarding string literals and the national character set: If you were using a non-Unicode database character set (e.g. in Europe usually WE8ISO8859P1 or WE8ISO8859P15) and attempted to put a character not part of the database character set into a column of type NCHAR/NVARCHAR/NCLOB using a string literal, you wouldn't succeed. This was because the SQL parser always converted the whole SQL string into the database character set before parsing and executing the statement.

Although Oracle introduced the N' modifier for national character set string literals several versions ago, it didn't really help, due to the limitation mentioned above: even string literals marked with N' were converted into the database character set by the SQL parser.

10gR2 introduces a new OCI mode and a corresponding environment variable which can be set to enable the new mode for OCI applications that do not support turning it on explicitly. Using a 10gR2 client and server, you can set the following environment variable:

ORA_NCHAR_LITERAL_REPLACE=TRUE

before starting the OCI application, e.g. SQL*Plus. You need to enable this new mode explicitly; by default it is disabled (for backwards compatibility, they say).

So, only if these three requirements are met:

1. 10gR2 server
2. 10gR2 client
3. Environment variable ORA_NCHAR_LITERAL_REPLACE=TRUE is set on client before starting OCI application

then the above-mentioned limitation is lifted and you are actually able to put characters only available in the national character set into the database by using an N' string literal.

Note that you need to use the N' modifier for the string literal; otherwise the string literal will be converted into the database character set, and meeting the requirements mentioned above will not help either.

Here is a small script that demonstrates the issue. It was tested in the following environment:

Server:
10gR2 10.2.0.2 Enterprise Edition on Linux (32bit)
Database character set WE8ISO8859P1
National character set AL16UTF16

Client:
10gR2 10.2.0.3 Windows XP (32bit)
Client character set WE8MSWIN1252

The script:

create table test_nls1(text1 nclob);

create table test_nls2(text2 clob);

insert into test_nls1(text1) values (N'äöü@@ßß•');

insert into test_nls2(text2) values (N'äöü@@ßß•');

commit;

insert into test_nls1(text1) values ('äöü@@ßß•');

insert into test_nls2(text2) values ('äöü@@ßß•');

commit;

If you use an application that is capable of displaying the characters correctly, you'll find that only the first of the four inserts actually stored the last character of the string literal correctly in the database, and only if you met the three requirements mentioned above. (Funny enough, SQL*PlusW, the Windows version of SQL*Plus, does not display it and shows the last character as a "block", although it can be used to run the script. The Java-based Oracle tool SQLDeveloper displays it correctly, but is in turn not capable of storing the data correctly in the database; at least it didn't work for me, maybe some settings need to be adjusted.)

Note that the last character of the string literal is a Windows specific ANSI character that is not part of the WE8ISO8859P1 database character set. So if you attempt to insert this string into a normal text column the character set conversion will replace this character with one of the usual "replacement" characters like "¿".

Please note further that you need a database that is using a different database character set than WE8MSWIN1252. This character set is typically used by default databases created under Windows, so be aware that your results will be different if your database uses this character set, or if your database uses the same character set as your client (because in that case no conversion at all will take place).

And one final note: My first attempts to test this new functionality failed when using the "Instant Client 10.2.0.1" on Windows XP (32bit). It just didn't seem to use the new setting of the environment variable; I never succeeded in putting the string literal correctly into the NCLOB column.


You can find further details about this topic at the following web locations:

MetaLink Knowledge Base: Character Sets & Conversion - Frequently Asked Questions Doc ID: 227330.1
Oracle 10gR2 Online Documentation - Globalization Support Guide - Programming with Unicode

Tuesday, April 24, 2007

Performing PL/SQL functions in parallel which are not "pure"

If you find out that a particular process involving calls to user-defined PL/SQL functions is CPU bound, then one of the measures you might want to consider - if your hardware is powerful enough - is executing your statement in parallel, and in particular executing the PL/SQL function in parallel, which is not necessarily the same thing.

It is possible in Oracle that a statement runs in parallel but the PL/SQL function is executed by the statement coordinator process. In that case you probably won't achieve the improvements you expected from parallel execution: if the PL/SQL function accounts for most of the CPU time, it is still effectively executed serially. The parallel slaves do perform some of the work, e.g. scanning a particular part of a table or partition segment, but they simply queue up waiting to deliver data to the coordinator process, which is busy executing the PL/SQL function.

Oracle executes a user-defined PL/SQL function in parallel only if it is a "pure" function. "Pure" means that it does not depend on package state (it does not read or write variables defined at package/session level; in addition, a really "pure" function should not read or write the database either, but here we focus on the package/session state). The "purity" of a function can be determined in several ways:
  • If it is a stand-alone PL/SQL function, Oracle is able to determine by itself (I think since Oracle 8i) whether the function is pure or not. This does not apply to functions defined within a package.
  • If it is a package function or you want to make sure that your function is "pure", you can use the compiler directive "pragma restrict_references" to declare the function as pure by specifying that it is not allowed to read or write the package state. If your function attempts to access variables defined on package level the compiler will show corresponding errors and your package will fail to compile (unless you use the option "TRUST" of the pragma).
  • You can tell Oracle to "trust" you by specifying "parallel_enable" as part of the function declaration in the package specification. In this case Oracle does not enforce anything, it will execute your function in parallel and you are responsible that the function does not screw up when being executed in parallel.
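As an illustration of the last two options (the package and function names here are made up, not taken from the scripts below), a package specification can combine both techniques: declare one function provably pure via the pragma, and simply trust another via PARALLEL_ENABLE:

```sql
create or replace package pk_purity_demo as
  -- provably "pure": compilation fails if the function reads or writes
  -- package state (RNPS/WNPS) or the database (RNDS/WNDS)
  function f_pure(in_n in number) return number;
  pragma restrict_references(f_pure, RNDS, WNDS, RNPS, WNPS);

  -- merely trusted: Oracle enforces nothing and will execute the function
  -- in parallel; you are responsible for its parallel safety
  function f_trusted(in_n in number) return number parallel_enable;
end pk_purity_demo;
/
```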

The reason for all these measures is that Oracle implements parallel execution by spreading the execution of the statement across separate sessions which do not share package/session state: a variable that has been initialised in the statement coordinator session with a particular value is not available to the parallel slaves. So if your user-defined function depends on variables defined at package scope, it will run into trouble when executed in parallel, since each parallel execution slave has its own non-shared session environment in which the package-level variables are initialised separately, and each slave potentially ends up with blank or different values.

Now what options do you have if you need to use a user-defined PL/SQL function that depends on information that cannot be passed as parameters but needs to be defined in your session before the actual statement is executed in parallel?

The first thing to consider is of course whether the required information can be passed as a parameter after all, so that there is no need to read additional information from anywhere else.

If that is not an option then two potential solutions are presented here. The first uses a database table to pass information to the parallel slaves, the second uses the concept of contexts.

Let's have a closer look at approach number one.

The key concept behind the first approach is that the information that is supposed to be made available to the user-defined PL/SQL functions executed in the parallel slaves is stored in a database table and the PL/SQL function then needs to be able to identify which particular set of data in the table it has been assigned to use.

This can be achieved by using the SID (or more generally the INSTANCE_ID plus SID for RAC environments) of the statement coordinator session as unique identifier in the table. So the PL/SQL function only needs to find out the SID and INSTANCE_ID of the parent process to determine which rows of the parameter table to use.

The parameter table could have the following generic layout:

CREATE TABLE PX_PARAMETER (
COORDINATOR_ID VARCHAR2(20) NOT NULL,
PARAMETER_NAME VARCHAR2(128) NOT NULL,
PARAMETER_VALUE VARCHAR2(4000),
CONSTRAINT PX_PARAMETER_PK PRIMARY KEY (COORDINATOR_ID, PARAMETER_NAME));

In order to find out what the coordinator id of the parallel slave's parent is, the following information from V$SESSION can be used (the following description is taken from the Oracle "Reference" documentation and applies to Oracle 9i and 10g):

Column OWNERID: "The column contents are invalid if the value is 2147483644. Otherwise, this column contains the identifier of the user who owns the migratable session.
For operations using Parallel Slaves, interpret this value as a 4-byte value. The low-order 2 bytes of which represent the session number, and the high-order bytes the instance ID of the query coordinator"

So by shifting the bits around a bit in OWNERID, the parallel slave should be able to determine the SID and INSTANCE_ID of its parent process and can use this information to query the parameter table accordingly.
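The decoding itself is plain bit arithmetic and can be tried out on DUAL; the OWNERID value below is a made-up example encoding instance 1 and session 124 (1 * 65536 + 124 = 65660):

```sql
select bitand(65660, 65535)            as parent_sid,     -- low-order 2 bytes: 124
       bitand(65660, 16711680) / 65536 as parent_inst_id  -- next byte: 1
from dual;
```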

Here is a sample script showing the whole stuff in action:

SQL> 
SQL> drop table test_source_data purge;

Table dropped.

SQL> drop table test_dest_data purge;

Table dropped.

SQL> drop package pk_px_parameter_test;

Package dropped.

SQL> drop package pk_px_parameter;

Package dropped.

SQL> drop table px_parameter purge;

Table dropped.

SQL> 
SQL> CREATE TABLE PX_PARAMETER (
  2  COORDINATOR_ID VARCHAR2(20) NOT NULL,
  3  PARAMETER_NAME VARCHAR2(128) NOT NULL,
  4  PARAMETER_VALUE VARCHAR2(4000),
  5  CONSTRAINT PX_PARAMETER_PK PRIMARY KEY (COORDINATOR_ID, PARAMETER_NAME));

Table created.

SQL> 
SQL> create or replace package pk_px_parameter as
  2    -- utility functions
  3    function get_my_sid return number deterministic parallel_enable;
  4    function get_my_inst_id return varchar2 deterministic parallel_enable;
  5    -- call this procedure in your statement coordinator session to set a parameter before
  6    -- executing a statement in parallel
  7    -- it does not commit its work, you need to do it yourself
  8    procedure set_px_parameter(in_s_parameter_name in varchar2, in_s_parameter_value in varchar2);
  9    -- call this function in your user-defined pl/sql function to get a parameter value
 10    -- should work in both parallel or serial mode
 11    function get_px_parameter(in_s_parameter_name in varchar2) return varchar2 parallel_enable;
 12  end pk_px_parameter;
 13  /

Package created.

SQL> 
SQL> create or replace package body pk_px_parameter as
  2    function get_my_sid return number deterministic parallel_enable is
  3      n_my_sid number;
  4    begin
  5      select sid into n_my_sid from v$mystat where rownum = 1;
  6      return n_my_sid;
  7    end get_my_sid;
  8  
  9    function get_my_inst_id return varchar2 deterministic parallel_enable is
 10      s_my_inst_id varchar2(30);
 11    begin
 12      s_my_inst_id := sys_context('USERENV', 'INSTANCE');
 13      return s_my_inst_id;
 14    end get_my_inst_id;
 15  
 16    function get_my_id return varchar2 deterministic is
 17      n_my_sid number;
 18      s_my_inst_id varchar2(30);
 19      s_my_id varchar2(50);
 20    begin
 21      n_my_sid := get_my_sid;
 22      s_my_inst_id := get_my_inst_id;
 23      s_my_id := s_my_inst_id || '|' || to_char(n_my_sid, 'TM');
 24  
 25      return s_my_id;
 26    end get_my_id;
 27  
 28    function get_coordinator_id return varchar2 deterministic is
 29      n_my_sid number;
 30      n_ownerid number;
 31      n_parent_sid number;
 32      n_parent_inst_id number;
 33      s_coordinator_id varchar2(50);
 34    begin
 35      n_my_sid := get_my_sid;
 36      SELECT ownerid,
 37      round(bitand(ownerid, 65535)) as parent_session_sid,
 38      round(bitand(ownerid, 16711680) / 65536) as parent_session_instid
 39      into n_ownerid, n_parent_sid, n_parent_inst_id
 40      from v$session where sid = n_my_sid;
 41  
 42      if n_ownerid = 2147483644 then
 43        -- no parallel execution, use current inst_id and sid
 44        s_coordinator_id := get_my_inst_id || '|' || to_char(n_my_sid, 'TM');
 45      else
 46        -- parallel execution, use parent info
 47        s_coordinator_id := to_char(n_parent_inst_id, 'TM') || '|' || to_char(n_parent_sid, 'TM');
 48      end if;
 49  
 50      return s_coordinator_id;
 51    end get_coordinator_id;
 52  
 53    -- call this procedure in your statement coordinator session to set a parameter before
 54    -- executing a statement in parallel
 55    -- it does not commit its work, you need to do it yourself
 56    procedure set_px_parameter(in_s_parameter_name in varchar2, in_s_parameter_value in varchar2) is
 57      s_my_id varchar2(50);
 58    begin
 59      s_my_id := get_my_id;
 60  
 61      insert into px_parameter (coordinator_id, parameter_name, parameter_value) values (s_my_id, in_s_parameter_name, in_s_parameter_value);
 62    end set_px_parameter;
 63    -- call this function in your user-defined pl/sql function to get a parameter value
 64    -- should work in both parallel or serial mode
 65    function get_px_parameter(in_s_parameter_name in varchar2) return varchar2 parallel_enable is
 66      s_coordinator_id varchar2(50);
 67      s_parameter_value varchar2(4000);
 68    begin
 69      s_coordinator_id := get_coordinator_id;
 70  
 71      select parameter_value
 72      into s_parameter_value
 73      from px_parameter
 74      where coordinator_id = s_coordinator_id
 75      and parameter_name = in_s_parameter_name;
 76  
 77      return s_parameter_value;
 78    end get_px_parameter;
 79  end pk_px_parameter;
 80  /

Package body created.

SQL> 
SQL> create table test_source_data parallel nologging as select * from all_objects;

Table created.

SQL> 
SQL> select count(*) from test_source_data;

  COUNT(*)
----------
     98440

SQL> 
SQL> create table test_dest_data (
  2  the_sid number not null,
  3  the_test_data varchar2(4000) not null)
  4  parallel nologging;

Table created.

SQL> 
SQL> alter session enable parallel dml;

Session altered.

SQL> 
SQL> begin pk_px_parameter.set_px_parameter('THE_DATA', 'This is
 read by the parallel slave!'); end;
  2  /

PL/SQL procedure successfully completed.

SQL> 
SQL> column parameter_name format a20
SQL> column parameter_value format a40
SQL> 
SQL> select * from px_parameter;

COORDINATOR_ID       PARAMETER_NAME       PARAMETER_VALUE
-------------------- -------------------- ----------------------------------------
1|124                THE_DATA             This is
 read by the parallel slave!

SQL> 
SQL> commit;

Commit complete.

SQL> 
SQL> create or replace package pk_px_parameter_test is
  2    function get_the_sid return number deterministic parallel_enable;
  3    function get_the_test_data return varchar2 deterministic parallel_enable;
  4  end pk_px_parameter_test;
  5  /

Package created.

SQL> 
SQL> create or replace package body pk_px_parameter_test is
  2    g_n_the_sid number;
  3    g_s_the_test_data varchar2(4000);
  4  
  5    function get_the_sid return number deterministic parallel_enable is
  6    begin
  7      if g_n_the_sid is null then
  8        g_n_the_sid := pk_px_parameter.get_my_sid;
  9      end if;
 10  
 11      return g_n_the_sid;
 12    end get_the_sid;
 13  
 14    function get_the_test_data return varchar2 deterministic parallel_enable is
 15    begin
 16      if g_s_the_test_data is null then
 17        g_s_the_test_data := 'I''m session: ' || pk_px_parameter.get_my_sid || ' and I got this: ' || pk_px_parameter.get_px_parameter('THE_DATA');
 18      end if;
 19  
 20      return g_s_the_test_data;
 21    end;
 22  end pk_px_parameter_test;
 23  /

Package body created.

SQL> 
SQL> insert /*+ append parallel(d) */ into test_dest_data d (the_sid, the_test_data)
  2  select pk_px_parameter_test.get_the_sid, pk_px_parameter_test.get_the_test_data
  3  from test_source_data;

98440 rows created.

SQL> 
SQL> commit;

Commit complete.

SQL> 
SQL> column the_test_data format a40
SQL> 
SQL> select the_sid, the_test_data, count(*) from test_dest_data
  2  group by the_sid, the_test_data;

   THE_SID THE_TEST_DATA                              COUNT(*)
---------- ---------------------------------------- ----------
       117 I'm session: 117 and I got this: This is      52910
           read
 by the parallel slave!

       142 I'm session: 142 and I got this: This is      45530
           read
 by the parallel slave!


SQL> 
SQL> spool off

As you can see, the parallel slaves were able to pick up the data which had been stored by the coordinator process prior to executing the parallel DML.

The provided package PK_PX_PARAMETER can be used as a starting point for your own development. It lacks features like exception handling, but it shows the basic functionality required to set and get parameters used in parallel slaves. It requires at least SELECT privileges on V$MYSTAT and V$SESSION. As always with AUTHID DEFINER packages, these privileges need to be granted directly and explicitly to the user owning the packages; granting a role won't be sufficient.

Additionally, the PK_PX_PARAMETER_TEST package shows how to cache information obtained via PK_PX_PARAMETER, in order to prevent the parallel execution from causing excessive latching/logical I/Os by executing the recursive SQL that reads the data from the table PX_PARAMETER over and over again. Of course, in a real-world application you need to ensure that you can tell whether the data your cache holds is outdated, in case already established parallel slave sessions are reused. One possible approach is to mark each of your operations with a unique "run_id" (which could be stored in a context, see below for a description of how to use this); if the current run_id differs from your cached run_id, you can discard your current cache and re-load the actual data into the cache.
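A minimal sketch of such an invalidation scheme, building on the PK_PX_PARAMETER package shown above (the RUN_ID parameter name and the demo package name are hypothetical):

```sql
create or replace package body pk_cached_parameter_demo as
  g_s_run_id   varchar2(4000);
  g_s_the_data varchar2(4000);

  function get_the_test_data return varchar2 parallel_enable is
    s_current_run_id varchar2(4000);
  begin
    s_current_run_id := pk_px_parameter.get_px_parameter('RUN_ID');
    -- discard the cache if it was populated by a previous run
    if g_s_run_id is null or g_s_run_id != s_current_run_id then
      g_s_the_data := pk_px_parameter.get_px_parameter('THE_DATA');
      g_s_run_id   := s_current_run_id;
    end if;
    return g_s_the_data;
  end get_the_test_data;
end pk_cached_parameter_demo;
/
```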

Let's turn to the second approach now.

A different approach, which can be used if the data to be passed to the PL/SQL function is not too large, is to use a context.

Since version 8i, Oracle has offered the concept of contexts, which is basically a user-defined namespace in which you can define an arbitrary number of string variables that can be set via DBMS_SESSION.SET_CONTEXT and accessed via the SYS_CONTEXT() function. Contexts are mainly used in conjunction with Row-Level Security, but they can be used on their own as well.

In 10g, the variables defined in the context namespace at statement coordinator level are propagated to the parallel execution slaves, so all you need to do is set your variables and read them in the user-defined PL/SQL function being executed in parallel. The function should be able to access the variables even though they were defined in the parent session.
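Stripped of any package wrapper, the mechanism boils down to something like this (the context and procedure names here are illustrative; the full PK_PX_PARAMETER-based variant follows below):

```sql
-- a context must be bound to a "trusted" package/procedure, which alone may set it
create or replace context ct_demo using p_set_demo;

create or replace procedure p_set_demo(in_s_value in varchar2) as
begin
  dbms_session.set_context('CT_DEMO', 'THE_DATA', in_s_value);
end;
/

-- set the value in the coordinator session ...
exec p_set_demo('some value')

-- ... and read it back via SYS_CONTEXT, in 10g also from within parallel slaves
select sys_context('CT_DEMO', 'THE_DATA') as the_data from dual;
```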

In 9i, the official documentation (Application Developer's Guide - Fundamentals, Chapter 12: "Implementing Application Security Policies") says the following:

"If SYS_CONTEXT is used inside a SQL function which is embedded in a parallel query, the function cannot pick up the application context. This is true because the application context exists only in the user session. To use these features in combination, you must call SYS_CONTEXT directly from the query."

This seems to be outdated, or a documentation bug: when testing the 10g version of the script against 9.2.0.8, it worked exactly the same, so the context information was propagated to the parallel execution slaves. Note: Tests with 9.2.0.2 terminated with a "Parallel slave unexpectedly died" error message when accessing the context information in the parallel slave, so make sure that you run a current version of 9iR2 if you plan to use this.

Here is a sample script showing the second approach. It does not need a table to store the parameters, but of course requires more memory, since your parameter data is held in the variables of the namespace defined by the context.

SQL> 
SQL> drop table test_source_data purge;

Table dropped.

SQL> drop table test_dest_data purge;

Table dropped.

SQL> drop package pk_px_parameter_test;

Package dropped.

SQL> drop package pk_px_parameter;

Package dropped.

SQL> drop context ct_px_parameter;

Context dropped.

SQL> 
SQL> create or replace context ct_px_parameter using pk_px_parameter;

Context created.

SQL> 
SQL> create or replace package pk_px_parameter as
  2    -- utility functions
  3    function get_my_sid return number deterministic parallel_enable;
  4    function get_my_inst_id return varchar2 deterministic parallel_enable;
  5    -- call this procedure in your statement coordinator session to set a parameter before
  6    -- executing a statement in parallel
  7    -- it does not commit its work, you need to do it yourself
  8    procedure set_px_parameter(in_s_parameter_name in varchar2, in_s_parameter_value in varchar2);
  9    -- call this function in your user-defined pl/sql function to get a parameter value
 10    -- should work in both parallel or serial mode
 11    function get_px_parameter(in_s_parameter_name in varchar2) return varchar2 parallel_enable;
 12  end pk_px_parameter;
 13  /

Package created.

SQL> 
SQL> create or replace package body pk_px_parameter as
  2    function get_my_sid return number deterministic parallel_enable is
  3      n_my_sid number;
  4    begin
  5      select sid into n_my_sid from v$mystat where rownum = 1;
  6      return n_my_sid;
  7    end get_my_sid;
  8  
  9    function get_my_inst_id return varchar2 deterministic parallel_enable is
 10      s_my_inst_id varchar2(30);
 11    begin
 12      s_my_inst_id := sys_context('USERENV', 'INSTANCE');
 13      return s_my_inst_id;
 14    end get_my_inst_id;
 15  
 16    function get_my_id return varchar2 deterministic is
 17      n_my_sid number;
 18      s_my_inst_id varchar2(30);
 19      s_my_id varchar2(50);
 20    begin
 21      n_my_sid := get_my_sid;
 22      s_my_inst_id := get_my_inst_id;
 23      s_my_id := s_my_inst_id || '|' || to_char(n_my_sid, 'TM');
 24  
 25      return s_my_id;
 26    end get_my_id;
 27  
 28    function get_coordinator_id return varchar2 deterministic is
 29      n_my_sid number;
 30      n_ownerid number;
 31      n_parent_sid number;
 32      n_parent_inst_id number;
 33      s_coordinator_id varchar2(50);
 34    begin
 35      n_my_sid := get_my_sid;
 36      SELECT ownerid,
 37      round(bitand(ownerid, 65535)) as parent_session_sid,
 38      round(bitand(ownerid, 16711680) / 65536) as parent_session_instid
 39      into n_ownerid, n_parent_sid, n_parent_inst_id
 40      from v$session where sid = n_my_sid;
 41  
 42      if n_ownerid = 2147483644 then
 43        -- no parallel execution, use current inst_id and sid
 44        s_coordinator_id := get_my_inst_id || '|' || to_char(n_my_sid, 'TM');
 45      else
 46        -- parallel execution, use parent info
 47        s_coordinator_id := to_char(n_parent_inst_id, 'TM') || '|' || to_char(n_parent_sid, 'TM');
 48      end if;
 49  
 50      return s_coordinator_id;
 51    end get_coordinator_id;
 52  
 53    -- call this procedure in your statement coordinator session to set a parameter before
 54    -- executing a statement in parallel
 55    -- it does not commit its work, you need to do it yourself
 56    procedure set_px_parameter(in_s_parameter_name in varchar2, in_s_parameter_value in varchar2) is
 57      -- s_my_id varchar2(50);
 58    begin
 59      -- s_my_id := get_my_id;
 60  
 61      dbms_session.set_context('CT_PX_PARAMETER', in_s_parameter_name, in_s_parameter_value);
 62    end set_px_parameter;
 63    -- call this function in your user-defined pl/sql function to get a parameter value
 64    -- should work in both parallel or serial mode
 65    function get_px_parameter(in_s_parameter_name in varchar2) return varchar2 parallel_enable is
 66      -- s_coordinator_id varchar2(50);
 67      s_parameter_value varchar2(4000);
 68    begin
 69      -- s_coordinator_id := get_coordinator_id;
 70  
 71      s_parameter_value := sys_context('CT_PX_PARAMETER', in_s_parameter_name);
 72  
 73      return s_parameter_value;
 74    end get_px_parameter;
 75  end pk_px_parameter;
 76  /

Package body created.

SQL> 
SQL> create table test_source_data parallel nologging as select * from all_objects;

Table created.

SQL> 
SQL> select count(*) from test_source_data;

  COUNT(*)
----------
     98445

SQL> 
SQL> create table test_dest_data (
  2  the_sid number not null,
  3  the_test_data varchar2(4000) not null)
  4  parallel nologging;

Table created.

SQL> 
SQL> alter session enable parallel dml;

Session altered.

SQL> 
SQL> begin pk_px_parameter.set_px_parameter('THE_DATA', 'This is
 read by the parallel slave!'); end;
  2  /

PL/SQL procedure successfully completed.

SQL> 
SQL> column parameter_value format a40
SQL> 
SQL> select sys_context('CT_PX_PARAMETER', 'THE_DATA') as parameter_value from dual;

PARAMETER_VALUE
----------------------------------------
This is
 read by the parallel slave!

SQL> 
SQL> create or replace package pk_px_parameter_test is
  2    function get_the_sid return number deterministic parallel_enable;
  3    function get_the_test_data return varchar2 deterministic parallel_enable;
  4  end pk_px_parameter_test;
  5  /

Package created.

SQL> 
SQL> create or replace package body pk_px_parameter_test is
  2    g_n_the_sid number;
  3    g_s_the_test_data varchar2(4000);
  4  
  5    function get_the_sid return number deterministic parallel_enable is
  6    begin
  7      if g_n_the_sid is null then
  8        g_n_the_sid := pk_px_parameter.get_my_sid;
  9      end if;
 10  
 11      return g_n_the_sid;
 12    end get_the_sid;
 13  
 14    function get_the_test_data return varchar2 deterministic parallel_enable is
 15    begin
 16      if g_s_the_test_data is null then
 17        g_s_the_test_data := 'I''m session: ' || pk_px_parameter.get_my_sid || ' and I got this: ' || pk_px_parameter.get_px_parameter('THE_DATA');
 18      end if;
 19  
 20      return g_s_the_test_data;
 21    end;
 22  end pk_px_parameter_test;
 23  /

Package body created.

SQL> 
SQL> insert /*+ append parallel(d) */ into test_dest_data d (the_sid, the_test_data)
  2  select pk_px_parameter_test.get_the_sid, pk_px_parameter_test.get_the_test_data
  3  from test_source_data;

98445 rows created.

SQL> 
SQL> commit;

Commit complete.

SQL> 
SQL> column the_test_data format a40
SQL> 
SQL> select the_sid, the_test_data, count(*) from test_dest_data
  2  group by the_sid, the_test_data;

   THE_SID THE_TEST_DATA                              COUNT(*)
---------- ---------------------------------------- ----------
       116 I'm session: 116 and I got this: This is      42005
           read
 by the parallel slave!

       117 I'm session: 117 and I got this: This is      56440
           read by
 the parallel slave!


SQL> 
SQL> spool off

This version of the script requires the CREATE ANY CONTEXT system privilege to work properly. If you want to be able to drop contexts, you need in addition the DROP ANY CONTEXT system privilege granted.

Note that this version of PK_PX_PARAMETER does not need a table to store the parameter information, but in turn potentially consumes more memory, depending on the amount of data you pass and on how Oracle internally handles the propagation of the context to the parallel execution slaves.

In case you need to pass larger amounts of data, I recommend the database-table-based approach. If the data you need to pass is more lightweight, you can use the context/namespace approach.