The Cost Based Optimizer (CBO)
is a rather complex piece of code that has to deal with countless different
possible scenarios when trying to determine what the most optimal execution
plan might be. It’s also a vitally important piece of code because not only do
the decisions need to be reasonably accurate so that it doesn’t generate
inefficient execution plans but it needs to make these decisions in a
reasonably efficient manner else it wastes resources and most importantly
wastes time while it performs its calculations.
So there’s a trade-off between
ensuring the CBO makes reasonable decisions while ensuring it makes its
decisions in a timely and resource efficient manner. Database performance could
be directly impacted if these trade-offs are not managed effectively.
Therefore, there are all sorts
of short cuts and assumptions that are coded into the CBO to make its life a
little easier.
Typically when we have a
condition where we just say NOT EQUAL, we’re basically suggesting we’re
interested in the vast majority of possible values with the exception of the
value specified in the NOT EQUAL condition. We want most values but not if it’s
this particular value.
For example, a condition where
we state something such as:
WHERE id <> 99
means we want all the other
possible values of 99, just not those with the specific value of 99. In other
words, we’re typically interested in the vast majority of possible values when
we specify a NOT EQUAL condition.
However, we all know that
typically, Oracle will not use an index if generally a relatively “high”
percentage of rows are to be selected. It would generally be more efficient and
less costly to simply perform a Full Table Scan if most rows are going to be
returned anyways.
Therefore the CBO simply
ignores indexes when costing a NOT EQUAL condition. Why bother going to all the
overhead of calculating the cost of using an index to retrieve the vast
majority of rows when a Full Table Scan is going to be the cheaper alternative
in the vast majority of such cases. So the CBO doesn’t even bother trying and
ignores all indexes that could potentially be used to retrieve the rows based
on the NOT EQUAL condition.
But what if the data isn’t
evenly distributed and the NOT EQUAL condition actually retrieves only a
relatively small proportion of the rows. What if most rows actually have the
value specified in the NOT EQUAL condition and the remaining rows constitute a
relatively small proportion of the remaining rows?
When the CBO ignores indexes,
it ignores indexes in all cases. Even if 99.99% of rows match the value in the
NOT EQUAL condition and there’s only a handful of remaining rows to actually be
retrieved, the code path in the CBO is still followed and indexes are ignored
regardless.
rajesh@ORA12C> create table t as
2 select a.*,
3 decode(rownum,1,1,99) x
4 from all_objects a;
Table created.
rajesh@ORA12C>
rajesh@ORA12C> create index t_idx on t(x);
Index created.
rajesh@ORA12C>
rajesh@ORA12C> begin
2 dbms_stats.gather_table_stats
3 (ownname=> user,
4 tabname=> 'T',
5 method_opt=>'for all indexed columns
size 5') ;
6 end;
7 /
PL/SQL procedure successfully completed.
rajesh@ORA12C> set autotrace traceonly explain
rajesh@ORA12C> select min(object_id) from t where x <>
99;
Execution Plan
----------------------------------------------------------
Plan hash value: 2966233522
---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | SELECT
STATEMENT | |
1 | 8 | 432
(1)| 00:00:01 |
| 1 | SORT AGGREGATE |
| 1 | 8 | | |
|* 2 | TABLE ACCESS FULL| T |
2 | 16 | 432
(1)| 00:00:01 |
---------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("X"<>99)
If we look at relevant parts of
the 10053 trace, we note that CBO don’t even cost or consider using Index.
Access path analysis for T
***************************************
SINGLE TABLE ACCESS PATH
Single Table Cardinality
Estimation for T[T]
SPD: Return code in
qosdDSDirSetup: NOCTX, estType = TABLE
Column (#19):
NewDensity:0.000006,
OldDensity:0.000006 BktCnt:89955.000000, PopBktCnt:89954.000000,
PopValCnt:1, NDV:2
Column (#19): X(NUMBER)
AvgLen: 3 NDV: 2
Nulls: 0 Density: 0.000006 Min: 0.000000 Max: 1.000000
Histogram: Freq kts: 2
UncompBkts: 89955 EndPtVals:
2 ActualVal: yes
Table: T Alias: T
Card: Original:
89955.000000 Rounded: 2 Computed: 1.500000 Non Adjusted: 1.500000
Scan IO Cost (Disk) = 430.000000
Scan CPU Cost (Disk)
= 57157410.960000
Total Scan IO Cost
= 430.000000 (scan (Disk))
+
0.000000 (io filter eval) (= 0.000000 (per row) * 89955.000000
(#rows))
= 430.000000
Total Scan CPU Cost =
57157410.960000 (scan (Disk))
+
4497750.000000 (cpu filter eval) (= 50.000000 (per row) *
89955.000000 (#rows))
= 61655160.960000
Access Path: TableScan
Cost: 431.605050
Resp: 431.605050 Degree: 0
Cost_io:
430.000000 Cost_cpu: 61655161
Resp_io:
430.000000 Resp_cpu: 61655161
Best:: AccessPath: TableScan
Cost:
431.605050 Degree: 1 Resp: 431.605050 Card: 1.500000 Bytes: 0.000000
***************************************
No comments:
Post a Comment