Tuesday, June 22, 2010

DBMS_STATS - Gather table statistics with many columns

Here is another good reason why you probably don't want to use tables with too many columns.

The first good reason is that Oracle stores rows of conventional heap tables with more than 255 columns (where at least one column value after the 255th is non-null) in multiple row pieces, even when the entire row would fit into a single block (and up to 11.2, basic and OLTP heap table compression is not supported on tables with more than 255 columns). This leads to what is sometimes called "intra-row chaining": Oracle needs to follow the row piece pointer to access columns after the 255th, resulting in multiple logical I/Os per row. Since each row piece holds at most 255 columns, a row approaching the hard limit of 1,000 columns consists of ceil(1000 / 255) = 4 pieces, so up to four logical I/Os per row.

Of course such a multi-piece row can easily end up as an actually chained row, with the row pieces scattered across different blocks, potentially leading to additional random single-block I/O - but I digress...

This simple script makes the effect (and the other one, yet to be described) visible in action - by default it creates a table with 1,000 columns and 10,000 rows where each row resides in a separate block (so only intra-row chaining should occur). I used an 8K block size tablespace with Manual Segment Space Management (MSSM) in order to avoid the ASSM overhead in this case:


----------------------------------------------------------------------------------------------------------
-- MANY_X_COLS.SQL
-- Usage:    @MANY_X_COLS [NUM_COLS] [PCTFREE] [PCTUSED] [TABLESPACE] [COL1] [COL2] [COL3] [COL4] [NUM_ROWS]
-- Defaults:              [1000]     [99]      [1]       [TEST_8K]    [001]  [256]  [512]  [768]  [10000]
--
-- Create a table MANY_X_COLS with a configurable number of columns
-- and populate four of its columns with skewed data (normal distribution)
----------------------------------------------------------------------------------------------------------

set termout off verify off
REM Save current settings
store set .settings replace

REM Command line handling

REM Preparation for default value handling
column 1 new_value 1
column 2 new_value 2
column 3 new_value 3
column 4 new_value 4
column 5 new_value 5
column 6 new_value 6
column 7 new_value 7
column 8 new_value 8
column 9 new_value 9

select
    '' as "1"
  , '' as "2"
  , '' as "3"
  , '' as "4"
  , '' as "5"
  , '' as "6"
  , '' as "7"
  , '' as "8"
  , '' as "9"
from
    dual
where
    rownum = 0;

REM Default values
select
    nvl('&&1', '1000')    as "1"
  , nvl('&&2', '99')      as "2"
  , nvl('&&3', '1')       as "3"
  , nvl('&&4', 'TEST_8K') as "4"
  , nvl('&&5', '001')     as "5"
  , nvl('&&6', '256')     as "6"
  , nvl('&&7', '512')     as "7"
  , nvl('&&8', '768')     as "8"
  , nvl('&&9', '10000')   as "9"
from
    dual;

define n_iter = &1
define n_pctfree = &2
define n_pctused = &3
define s_tblspace = &4
define s_col1 = &5
define s_col2 = &6
define s_col3 = &7
define s_col4 = &8
define n_num_rows = &9

set termout on echo on verify on

declare
  s_sql1 varchar2(32767);
  s_sql  varchar2(32767);

  e_table_or_view_not_exists exception;
  pragma exception_init(e_table_or_view_not_exists, -942);
begin
  -- build the column list col001 .. col<NUM_COLS>, each defined as varchar2(10)
  for i in 1..&n_iter loop
    s_sql1 := s_sql1 || 'col' || to_char(i, 'FM9000') || ' varchar2(10),';
  end loop;

  s_sql1 := rtrim(s_sql1, ',');

  -- drop any pre-existing table, ignoring ORA-00942
  s_sql := 'drop table many_x_cols purge';

  begin
    execute immediate s_sql;
  exception
    when e_table_or_view_not_exists then
      null;
  end;

  s_sql := 'create table many_x_cols (' || s_sql1 ||
           ') pctfree &n_pctfree pctused &n_pctused tablespace &s_tblspace';

  execute immediate s_sql;

  -- fixed seed for reproducible data
  dbms_random.seed(0);

  -- populate only the four requested columns with normally distributed data
  s_sql := 'insert /*+ append */ into many_x_cols (
      col&s_col1
    , col&s_col2
    , col&s_col3
    , col&s_col4
    )
    select
      trunc(sys.dbms_random.normal * &n_num_rows.) as col&s_col1
    , trunc(sys.dbms_random.normal * &n_num_rows.) as col&s_col2
    , trunc(sys.dbms_random.normal * &n_num_rows.) as col&s_col3
    , trunc(sys.dbms_random.normal * &n_num_rows.) as col&s_col4
    from
      dual
    connect by
      level <= &n_num_rows';

  execute immediate s_sql;

  commit;
end;
/

set termout off
REM Restore and cleanup
undefine 1 2 3 4 5 6 7 8 9
column 1 clear
column 2 clear
column 3 clear
column 4 clear
column 5 clear
column 6 clear
column 7 clear
column 8 clear
column 9 clear
@.settings
set termout on


If you have created the table with the defaults and now run some simple queries on it, for example:


SET AUTOTRACE TRACEONLY STATISTICS

SELECT COUNT(*) FROM MANY_X_COLS;


you should see approximately 10,000 consistent gets after a few initial runs (to get past the overhead of hard parsing). Your output should look similar to the following:


SQL> SELECT COUNT(*) FROM MANY_X_COLS;

Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
      10004  consistent gets
      10000  physical reads
          0  redo size
        422  bytes sent via SQL*Net to client
        415  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed


However, if you start to access columns after the 255th (and you've used the defaults or suitable values to populate at least one of these columns), then you'll notice that the number of consistent gets increases whereas the number of physical reads stays more or less the same:


SET AUTOTRACE TRACEONLY STATISTICS

SELECT COUNT(COL256) FROM MANY_X_COLS;



SQL> SELECT COUNT(COL256) FROM MANY_X_COLS;

Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
      30004  consistent gets
      10000  physical reads
          0  redo size
        427  bytes sent via SQL*Net to client
        415  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed


It is interesting to note that in 11.1.0.7 and 11.2.0.1 the consistent gets sometimes jumped directly from 10,000 to 30,000 and then increased to 40,000 and 50,000 when accessing later columns, whereas 10.2.0.4 showed the expected progression of 20,000, 30,000 and 40,000.

It is also interesting to note that the statistic "table fetch continued row" increases only once per row, which could be another interesting variation of the theme "how does Oracle record this statistic".
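You can observe this yourself by sampling the corresponding session statistic before and after fetching a trailing column (a minimal sketch; it assumes SELECT privileges on the V$ views and AUTOTRACE switched off):


SET AUTOTRACE OFF

SELECT SN.NAME, MS.VALUE
FROM V$MYSTAT MS, V$STATNAME SN
WHERE MS.STATISTIC# = SN.STATISTIC#
AND SN.NAME = 'table fetch continued row';

SELECT COUNT(COL256) FROM MANY_X_COLS;


Repeating the V$MYSTAT query after the COUNT(COL256) should show an increase of roughly 10,000 - one per row, not one per additional row piece - matching the observation above.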

OK, back to the main topic for today: DBMS_STATS and such tables with many columns.

The other good reason is that DBMS_STATS might switch to a multi-pass mode if a table exceeds a certain number of columns. Even if you request only basic column statistics (METHOD_OPT => 'FOR ... COLUMNS SIZE 1 ...') without any histograms (which require additional passes anyway), DBMS_STATS.GATHER_TABLE_STATS will perform multiple passes (usually full segment scans) against the same segment. This is true even if basic column statistics are requested for only a limited set of columns (for example 'FOR COLUMNS SIZE 1 COL1, COL2, COL3'), since Oracle always gathers some column statistics regardless of the options passed to DBMS_STATS (very likely in order to calculate, for example, the average row length for the table statistics).
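For example, a call requesting basic column statistics (no histograms) for all columns with the default auto sample size might look like this (a minimal sketch using the standard DBMS_STATS API; the option values are just for this demo):


begin
  dbms_stats.gather_table_stats(
    ownname          => user
  , tabname          => 'MANY_X_COLS'
  , estimate_percent => dbms_stats.auto_sample_size
  , method_opt       => 'FOR ALL COLUMNS SIZE 1'
  , cascade          => false
  );
end;
/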

The threshold at which DBMS_STATS switches to multiple passes depends on the options passed to DBMS_STATS and differs between Oracle 11g (11.1 and 11.2) and versions prior to 11g - and I'm not talking about the multiple passes that are used in versions prior to 11g to determine the optimal auto sample size (DBMS_STATS.AUTO_SAMPLE_SIZE, see below for more information).

In Oracle 10.2.0.4, DBMS_STATS is capable of handling a minimum of approximately 160 columns per pass if basic column statistics are requested, and a maximum of approximately 440 columns if no column statistics are requested, which means that it requires between three and seven passes to gather statistics for a 1,000 column table. Below is what such a typical query looks like - and you'll find up to seven of them, each with a different column list, in a SQL trace file of a DBMS_STATS call against a 1,000 column table.
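To see these queries yourself, run the gather call with extended SQL trace enabled, for example (a sketch; where the trace file ends up depends on your USER_DUMP_DEST / diagnostic settings):


alter session set tracefile_identifier = 'dbms_stats_many_cols';
alter session set events '10046 trace name context forever, level 8';

exec dbms_stats.gather_table_stats(user, 'MANY_X_COLS', method_opt => 'FOR ALL COLUMNS SIZE 1')

alter session set events '10046 trace name context off';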

Notice that in 10.2.0.4 DBMS_STATS gathers five different values per column for basic column statistics:

- Number of non-null values
- Number of distinct values
- Column size
- Minimum value
- Maximum value

If no column statistics are requested, the "number of non-null values" and the "column size" are still gathered per column, which explains why DBMS_STATS is capable of handling more columns per pass - simply less information per column is requested.
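For a single column, the five items listed above map to the following expressions in the trace query shown below (COLn stands for the actual column name; the count(*) appears once per query):


count("COLn")                             -- number of non-null values
count(distinct "COLn")                    -- number of distinct values
sum(sys_op_opnsize("COLn"))               -- column size
substrb(dump(min("COLn"),16,0,32),1,120)  -- minimum value
substrb(dump(max("COLn"),16,0,32),1,120)  -- maximum value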


select /*+ no_parallel(t) no_parallel_index(t) dbms_stats
cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring
*/ count(*),count("COL257"),count(distinct "COL257"),
sum(sys_op_opnsize("COL257")),substrb(dump(min("COL257"),16,0,32),1,120),
substrb(dump(max("COL257"),16,0,32),1,120),count("COL258"),count(distinct
"COL258"),sum(sys_op_opnsize("COL258")),substrb(dump(min("COL258"),16,0,32),
1,120),substrb(dump(max("COL258"),16,0,32),1,120),count("COL259"),
count(distinct "COL259"),sum(sys_op_opnsize("COL259")),
substrb(dump(min("COL259"),16,0,32),1,120),substrb(dump(max("COL259"),16,0,
32),1,120),...
.
.
.
from
"CBO_TEST"."MANY_X_COLS" t


call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.20       0.19          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch        2      1.21       3.07       9713      40010          0           1
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        4      1.41       3.26       9713      40010          0           1

Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 61  (recursive depth: 1)

Rows     Row Source Operation
-------  ---------------------------------------------------
      1  SORT GROUP BY (cr=40010 pr=9713 pw=0 time=3070537 us)
  10000   TABLE ACCESS FULL MANY_X_COLS (cr=40010 pr=9713 pw=0 time=1820245 us)


Tracing the DBMS_STATS.GATHER_TABLE_STATS call furthermore reveals that Oracle performs at least one additional query when using DBMS_STATS.AUTO_SAMPLE_SIZE (the default), which is probably used for determining the optimal sample size: prior to 11g Oracle uses an iterative approach, increasing the sample size in steps until the number of distinct values per column passes some statistical checks. The interesting thing, however, is that these additional queries are repeated per pass:


select /*+ no_parallel(t) no_parallel_index(t) dbms_stats
cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring
*/ count(*)
from
"CBO_TEST"."MANY_X_COLS" sample block ( 1.0000000000,1) t

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        7      0.01       0.00          0          0          0           0
Execute      7      0.00       0.00          0          0          0           0
Fetch        7      0.12       3.67       1971        764          0           7
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total       21      0.13       3.67       1971        764          0           7

Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 61  (recursive depth: 1)

Rows     Row Source Operation
-------  ---------------------------------------------------
      1  SORT AGGREGATE (cr=119 pr=332 pw=0 time=530678 us)
    108   TABLE ACCESS SAMPLE MANY_X_COLS (cr=119 pr=332 pw=0 time=646431 us)

********************************************************************************

select /*+ no_parallel(t) no_parallel_index(t) dbms_stats
cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring
*/ count(*)
from
"CBO_TEST"."MANY_X_COLS" t

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        7      0.00       0.00          0          0          0           0
Execute      7      0.00       0.00          0          0          0           0
Fetch        7      3.41      15.85      65858      70070          0           7
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total       21      3.41      15.86      65858      70070          0           7

Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 61  (recursive depth: 1)

Rows     Row Source Operation
-------  ---------------------------------------------------
      1  SORT AGGREGATE (cr=10010 pr=9445 pw=0 time=2228265 us)
  10000   TABLE ACCESS FULL MANY_X_COLS (cr=10010 pr=9445 pw=0 time=816529 us)


This means that in total between six (no column statistics) and at least 14 queries (basic column statistics for all columns) are used to gather the statistics for a 1,000 column table in 10.2.0.4 with DBMS_STATS.AUTO_SAMPLE_SIZE.

Using a fixed sample size (or compute) it takes between three and seven queries to gather the statistics for a 1,000 column table in 10.2.0.4. When computing basic statistics for all columns this means a minimum of 220,000 consistent gets on a 10,000 block table with 1,000 columns, compared to approximately 20,000 consistent gets for the same operation on a 255 column table - not taking into account the CPU overhead of performing the aggregates (count distinct, min, max, etc.) on all those columns.
Note that these consistent gets exclude the work required to maintain the dictionary and the statistics history information.
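In DBMS_STATS terms a compute simply corresponds to reading all rows, for example (a sketch; a NULL ESTIMATE_PERCENT means compute):


begin
  dbms_stats.gather_table_stats(
    ownname          => user
  , tabname          => 'MANY_X_COLS'
  , estimate_percent => null  -- null = compute, i.e. read all rows
  , method_opt       => 'FOR ALL COLUMNS SIZE 1'
  );
end;
/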

The additional queries executed when using DBMS_STATS.AUTO_SAMPLE_SIZE in 10.2.0.4 increase the number of consistent gets to 300,000 for the 1,000 column / 10,000 block table (in this case the automatism ends up performing a compute) and to 40,000 consistent gets for the 255 column / 10,000 block table.

Of course the intention of AUTO_SAMPLE_SIZE is to find an optimal sample size, not to actually increase the amount of work compared to a compute - but in this particular case that is exactly what happened.

This significant difference in consistent gets between the 1,000 column and the 255 column table (both spanning 10,000 blocks) has three reasons:

1. Each pass requires more consistent gets due to the multiple row pieces (up to four times as many)
2. DBMS_STATS performs many more passes on the 1,000 column table (when gathering basic column statistics for all columns, at least seven, compared to at least two on the 255 column table)
3. Obviously DBMS_STATS makes no systematic attempt to minimize the number of consistent gets per pass by accessing only leading columns in each pass (which would require only 10,000 / 20,000 / 30,000 / 40,000 gets for the successive passes). It looks more like the columns accessed per pass are chosen arbitrarily, often causing the maximum number of consistent gets per pass

The same applies to versions 11.1 and 11.2 as long as fixed sample sizes are used. However, when using DBMS_STATS.AUTO_SAMPLE_SIZE the whole gather-statistics logic has been revised, in particular due to the improvements regarding the approximation of the number of distinct column values and incremental partition statistics, which are described in great detail by Amit Poddar here.

Thanks to these improvements, when using DBMS_STATS.AUTO_SAMPLE_SIZE Oracle 11.1.0.7 and 11.2.0.1 are capable of handling up to 1,000 columns in a single pass if no column statistics are requested, and a minimum of approximately 260 columns per pass if basic column statistics are requested. This means it takes four passes to gather basic column statistics on all columns of a 1,000 column table. Furthermore, due to the revised logic, the additional queries used in 10.2.0.4 for DBMS_STATS.AUTO_SAMPLE_SIZE are no longer executed.

Tracing the DBMS_STATS.GATHER_TABLE_STATS call in 11.1.0.7 and 11.2.0.1 also reveals that a special operation is activated that is not represented in the SQL text itself. Accompanied by a trailing comment outlining what the special NDV operation is supposed to do, DBMS_STATS seems to activate an "APPROXIMATE NDV AGGREGATE" operation that is very likely used for the aforementioned improvement to the NDV algorithm.

Such a query then looks like the following:


select /*+ full(t) no_parallel(t) no_parallel_index(t) dbms_stats
cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring
no_substrb_pad */to_char(count("COL174")),to_char(substrb(dump(min("COL174")
,16,0,32),1,120)),to_char(substrb(dump(max("COL174"),16,0,32),1,120)),
to_char(count("COL175")),to_char(substrb(dump(min("COL175"),16,0,32),1,120))
,to_char(substrb(dump(max("COL175"),16,0,32),1,120)),...
.
.
.
from
"CBO_TEST"."MANY_X_COLS" t /* NDV,NIL,NIL,NDV,NIL,NIL,NDV,NIL,NIL,NDV,NIL,
NIL,NDV,NIL,NIL,NDV,NIL,NIL,NDV,NIL,NIL,NDV,NIL,NIL,NDV,NIL,NIL,NDV,NIL,NIL,
NDV,NIL,NIL,NDV,NIL,NIL,NDV,NIL,NIL,NDV,NIL,NIL,NDV,NIL,NIL,NDV,NIL,NIL,NDV,
NIL,NIL,NDV,NIL,NIL,NDV,NIL,NIL,NDV,NIL,NIL,NDV,NIL,NIL,NDV,NIL,NIL,NDV,NIL,
.
.
.
NIL,NIL,NDV,NIL,NIL,NDV,NIL,NIL,NDV,NIL,NIL,NDV,NIL,NIL*/


call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.11       0.10          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch        1      1.01       2.24       9845      30010          0           1
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        3      1.12       2.35       9845      30010          0           1

Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 93  (recursive depth: 1)

Rows     Row Source Operation
-------  ---------------------------------------------------
      1  SORT AGGREGATE (cr=30010 pr=9845 pw=0 time=0 us)
   8755   APPROXIMATE NDV AGGREGATE (cr=30010 pr=9845 pw=0 time=2336556 us cost=2 size=121114 card=82)
  10000    TABLE ACCESS FULL MANY_X_COLS (cr=30010 pr=9845 pw=0 time=300223 us cost=2 size=121114 card=82)


The following differences to 10g become obvious:

1. The SQL part has been reduced to

- Number of non-null values
- Minimum value
- Maximum value

for basic column statistics. The number of distinct values is obviously computed as part of the special operation "APPROXIMATE NDV AGGREGATE", as can be seen in the "Row Source Operation" (for further details refer to Amit Poddar's document). Note that if no column statistics are requested, this is reduced to a single expression per column: the number of non-null values.

2. The special operation "APPROXIMATE NDV AGGREGATE" appears in the execution plan

3. The trailing comment section of the statement describes what the "APPROXIMATE NDV AGGREGATE" is supposed to perform, for more details refer to the aforementioned document by Amit Poddar

4. All expressions are converted to strings using TO_CHAR

5. A new hint NO_SUBSTRB_PAD has been introduced (example taken from 11.2.0.1)

The reduced number of expressions in the SQL part is probably what allows Oracle 11g to process more columns per pass.

This improved variant requires approximately 150,000 consistent gets (four passes) to gather basic column statistics for all columns of a 10,000 block / 1,000 column table with DBMS_STATS.AUTO_SAMPLE_SIZE, compared to 10,000 consistent gets for a table with 255 columns (single pass).

So even with the improved approximate NDV algorithm of 11g, gathering basic column statistics (which, generally speaking, is recommended) still requires multiple passes for tables with more than approximately 260 columns. If you think about the consequences for a table that not only has many columns but also many rows (which are then at least intra-row chained), the potential overhead becomes quite obvious.

So, in summary, the best way to deal with such tables is to avoid them in the first place - when using DBMS_STATS.AUTO_SAMPLE_SIZE, attempt to keep the number of columns below approximately 160 in Oracle 10g, respectively 260 in 11g, to ensure a single-pass operation covering all columns.

If you really need that many columns, it might be worth the effort to write a custom gather-statistics procedure that explicitly lists the columns to gather statistics for in the METHOD_OPT parameter, in order to minimize the number of passes required - of course keeping in mind the potential side effects on the cost based optimizer when column statistics are missing but become relevant for optimization. In 11g with DBMS_STATS.AUTO_SAMPLE_SIZE this potentially reduces the number of passes to a single one, as the sketch below shows.
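Such a tailored call could look like this (a sketch based on the four populated columns of the example table; in practice you would list the columns your queries actually rely on):


begin
  dbms_stats.gather_table_stats(
    ownname    => user
  , tabname    => 'MANY_X_COLS'
  , method_opt => 'FOR COLUMNS SIZE 1 COL001, COL256, COL512, COL768'
  );
end;
/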

And as a side note: the deprecated ANALYZE ... STATISTICS command requires exactly one pass to gather basic column statistics for all 1,000 columns of the 10,000 row table - it even records only 10,000 consistent gets (and no "table fetch continued row"), which is quite interesting since it is unclear how the different row pieces are accessed in that case. It might, however, simply be an instrumentation issue. Of course in most cases it doesn't make sense to revert to ANALYZE ... STATISTICS, due to its shortcomings (to name just a few: no parallelism, no statistics history, no global statistics on partitioned tables, no approximate NDV improvements in 11g) and its differences to DBMS_STATS that might lead to plan changes - see also the other notes linked to in this note by Jonathan Lewis.
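For reference, the ANALYZE variant corresponding to this comparison would be something like the following (shown only for completeness; as said, it is deprecated for gathering optimizer statistics):


analyze table many_x_cols compute statistics for table for all columns size 1;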