As I discuss in the previous
series the problems and inconsistencies can appear between what the Automatic
Indexing (AI) process thinks will happen with the newly created AI and what
actually happens in other database sessions, this is because AI process session
uses a much higher degree of dynamic sampling ( dynamic sampling = 11 ) than
other database session uses by default (level =2)
As we in the previous
series, an SQL statement may be deemed not to use an Index in the AI
deliberations, where it is actually used in the normal session (perhaps
incorrectly so) where the data is heavily skewed and current statistics are
insufficient for the CBO to accurately detect such skewness is one such
scenario where we might encounter this issue.
One option to get around this is to hint
any such queries with a dynamic sampling value that matches that of the AI
process.
If we now re-run the problematic query
from the previous
series (where a new AI was inappropriately used by the CBO) with such a
dynamic sampling hint.
ai_demo@PDB19> show
parameter sampling
NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
optimizer_dynamic_sampling integer 2
ai_demo@PDB19>
ai_demo@PDB19> set autotrace traceonly exp statistics
ai_demo@PDB19>
ai_demo@PDB19> select x1,x2,object_name
2 from DEMO018a
3 where x1 = 17
4 and x2 = 17;
100000 rows
selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 2414111795
------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1000 | 30000 | 560 (0)| 00:00:01 |
| 1 | TABLE ACCESS BY INDEX ROWID BATCHED| DEMO018A | 1000 | 30000 | 560 (0)| 00:00:01 |
|* 2 | INDEX RANGE SCAN | SYS_AI_gawwp6r5fuzfr | 1000 | | 4 (0)| 00:00:01 |
------------------------------------------------------------------------------------------------------------
Predicate Information
(identified by operation id):
---------------------------------------------------
2 - access("X1"=17 AND
"X2"=17)
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
57097 consistent gets
55520 physical reads
0 redo size
4068922 bytes sent via SQL*Net to client
7872 bytes received via SQL*Net from client
668 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
100000 rows processed
ai_demo@PDB19> select /*+ dynamic_sampling(11) */ x1,x2,object_name
2 from DEMO018a
3 where x1 = 17
4 and x2 = 17;
100000 rows
selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 3250860338
--------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 103K| 3034K| 15208 (1)| 00:00:01 |
|* 1 | TABLE ACCESS STORAGE FULL| DEMO018A | 105K| 3085K| 15208 (1)| 00:00:01 |
--------------------------------------------------------------------------------------
Predicate Information
(identified by operation id):
---------------------------------------------------
1 - storage("X1"=17 AND
"X2"=17)
filter("X1"=17 AND "X2"=17)
Note
-----
- dynamic statistics used: dynamic sampling (level=AUTO)
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
56197 consistent gets
0 physical reads
0 redo size
2835214 bytes sent via SQL*Net to client
7872 bytes received via SQL*Net from client
668 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
100000 rows processed
ai_demo@PDB19> set
autotrace off
We can see that with Dynamic sampling
hint (level=11) in place, the CBO this time correctly calculated the
cardinality and hence correctly decided against the use of the AI.
Although these parameters can’t be
changed in Oracle Autonomous database cloud services, on Exadata platform if
using AI we might want to consider setting the OPTIMIZER_DYNAMIC_SAMPLING to 11
and OPTIMIZER_ADAPTIVE_STATISTICS to true in order to be consistent with AI
process. These settings can obviously add significant overhead during parsing
and so need to be set with caution.
In these scenarios where there is an
inherent relationship between columns which the CBO is not detecting, the
creation of extended statistics can be beneficial
We currently have the following columns
and statistics in the base table
ai_demo@PDB19> select
column_name, num_distinct, density,num_nulls, num_buckets,histogram
2 from all_tab_cols
3 where table_name ='DEMO018A'
4 and owner ='AI_DEMO';
COLUMN_NAME NUM_DISTINCT DENSITY
NUM_NULLS NUM_BUCKETS HISTOGRAM
--------------- ------------ ---------- ---------- ----------- ---------------
X1 100 .00000005 0 100 FREQUENCY
X2 100 .00000005 0 100 FREQUENCY
ID 9914368 1.0086E-07 0 1 NONE
OBJECT_NAME 94824 .000010546 0 1 NONE
If we now collect extended statistics on
both X1 and X2 columns
ai_demo@PDB19> begin
2 dbms_stats.gather_table_stats(user,'DEMO018A',
3 method_opt=>'for columns(x1,x2) size 254' );
4 end;
5 /
PL/SQL procedure successfully
completed.
ai_demo@PDB19> select
column_name, num_distinct, density,num_nulls, num_buckets,histogram
2 from all_tab_cols
3 where table_name ='DEMO018A'
4 and owner ='AI_DEMO';
COLUMN_NAME NUM_DISTINCT DENSITY
NUM_NULLS NUM_BUCKETS HISTOGRAM
--------------- ------------ ---------- ---------- ----------- ---------------
X1 100 .00000005 0 100 FREQUENCY
X2 100 .00000005 0 100 FREQUENCY
ID 9914368 1.0086E-07 0 1 NONE
OBJECT_NAME 94824 .000010546 0 1 NONE
SYS_STUXTXIJPH7 100 .00000005 0 100 FREQUENCY
P8F5TH2A09PZ12X
The CBO now have some idea on the
cardinality if both the columns are used within the predicates, if we re-run
the problematic query without the hint
ai_demo@PDB19> select
x1,x2,object_name
2 from DEMO018a
3 where x1 = 17
4 and x2 = 17;
100000 rows
selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 3250860338
--------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 100K| 2929K| 15208 (1)| 00:00:01 |
|* 1 | TABLE ACCESS STORAGE FULL| DEMO018A | 100K| 2929K| 15208 (1)| 00:00:01 |
--------------------------------------------------------------------------------------
Predicate Information
(identified by operation id):
---------------------------------------------------
1 - storage("X1"=17 AND
"X2"=17)
filter("X1"=17 AND "X2"=17)
Statistics
----------------------------------------------------------
5 recursive calls
0 db block gets
56200 consistent gets
0 physical reads
0 redo size
2835214 bytes sent via SQL*Net to client
7872 bytes received via SQL*Net from client
668 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
100000 rows processed
The CBO is correctly estimating the
cardinality of 100K rows and so is not using the automatic index. However we
can still get ourselves in problem, if we now re-run the query that returns no
rows
ai_demo@PDB19> select
x1,x2,object_name
2 from DEMO018a
3 where x1 = 1
4 and x2 = 2;
no rows selected
Execution Plan
----------------------------------------------------------
Plan hash value: 3250860338
--------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 50000 | 1464K| 15208 (1)| 00:00:01 |
|* 1 | TABLE ACCESS STORAGE FULL| DEMO018A | 50000 | 1464K| 15208 (1)| 00:00:01 |
--------------------------------------------------------------------------------------
Predicate Information
(identified by operation id):
---------------------------------------------------
1 - storage("X1"=1 AND
"X2"=2)
filter("X1"=1 AND "X2"=2)
Statistics
----------------------------------------------------------
2 recursive calls
0 db block gets
55537 consistent gets
0 physical reads
0 redo size
369 bytes sent via SQL*Net to client
535 bytes received via SQL*Net from client
1 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
0 rows processed
We can see that the CBO is now getting
this execution plan wrong and is now estimating incorrectly that 50K rows are
to be returned (and not about 1000 rows as estimated previously), this
increased estimate is now deemed too expensive for the AI to retrieve and is
now incorrectly using the Full table scan.
This is because with frequency based
histogram in place, Oracle assumes that 50% of the lowest recorded frequency
within the histogram is returned, if the value doesn’t exists but resides
within the know min-max range of values.
ai_demo@PDB19> select
endpoint_number , endpoint_number/2 as half_value
2 from user_tab_histograms
3 where table_name ='DEMO018A'
4 and column_name like 'SYS%'
5 order by endpoint_number
6 fetch first 1 row only ;
ENDPOINT_NUMBER HALF_VALUE
--------------- ----------
100000 50000
So we need to be very careful about how
to potentially collect any additional statistics and its potential impact on
other SQL statements.
One possible solution would be to use DS
hint with level = 11.
ai_demo@PDB19> select /*+
dynamic_sampling(11) */ x1,x2,object_name
2 from DEMO018a
3 where x1 = 1
4 and x2 = 2;
no rows selected
Execution Plan
----------------------------------------------------------
Plan hash value: 2414111795
------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 2 | 60 | 4 (0)| 00:00:01 |
| 1 | TABLE ACCESS BY INDEX ROWID BATCHED| DEMO018A | 2 | 60 | 4 (0)| 00:00:01 |
|* 2 | INDEX RANGE SCAN | SYS_AI_gawwp6r5fuzfr | 1 | | 3 (0)| 00:00:01 |
------------------------------------------------------------------------------------------------------------
Predicate Information
(identified by operation id):
---------------------------------------------------
2 - access("X1"=1 AND
"X2"=2)
Note
-----
- dynamic statistics used: dynamic sampling (level=AUTO)
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
3 consistent gets
0 physical reads
0 redo size
369 bytes sent via SQL*Net to client
535 bytes received via SQL*Net from client
1 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
0 rows processed
We can see that with Dynamic sampling
hint (level=11) in place, the CBO this time correctly calculated the
cardinality and hence correctly decided the use of the AI.
------------------------------------ ----------- ------------------------------
optimizer_dynamic_sampling integer 2
ai_demo@PDB19>
ai_demo@PDB19> set autotrace traceonly exp statistics
ai_demo@PDB19>
ai_demo@PDB19> select x1,x2,object_name
2 from DEMO018a
3 where x1 = 17
4 and x2 = 17;
----------------------------------------------------------
Plan hash value: 2414111795
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1000 | 30000 | 560 (0)| 00:00:01 |
| 1 | TABLE ACCESS BY INDEX ROWID BATCHED| DEMO018A | 1000 | 30000 | 560 (0)| 00:00:01 |
|* 2 | INDEX RANGE SCAN | SYS_AI_gawwp6r5fuzfr | 1000 | | 4 (0)| 00:00:01 |
------------------------------------------------------------------------------------------------------------
---------------------------------------------------
----------------------------------------------------------
0 recursive calls
0 db block gets
57097 consistent gets
55520 physical reads
0 redo size
4068922 bytes sent via SQL*Net to client
7872 bytes received via SQL*Net from client
668 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
100000 rows processed
2 from DEMO018a
3 where x1 = 17
4 and x2 = 17;
----------------------------------------------------------
Plan hash value: 3250860338
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 103K| 3034K| 15208 (1)| 00:00:01 |
|* 1 | TABLE ACCESS STORAGE FULL| DEMO018A | 105K| 3085K| 15208 (1)| 00:00:01 |
--------------------------------------------------------------------------------------
---------------------------------------------------
filter("X1"=17 AND "X2"=17)
-----
- dynamic statistics used: dynamic sampling (level=AUTO)
----------------------------------------------------------
0 recursive calls
0 db block gets
56197 consistent gets
0 physical reads
0 redo size
2835214 bytes sent via SQL*Net to client
7872 bytes received via SQL*Net from client
668 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
100000 rows processed
2 from all_tab_cols
3 where table_name ='DEMO018A'
4 and owner ='AI_DEMO';
--------------- ------------ ---------- ---------- ----------- ---------------
X1 100 .00000005 0 100 FREQUENCY
X2 100 .00000005 0 100 FREQUENCY
ID 9914368 1.0086E-07 0 1 NONE
OBJECT_NAME 94824 .000010546 0 1 NONE
2 dbms_stats.gather_table_stats(user,'DEMO018A',
3 method_opt=>'for columns(x1,x2) size 254' );
4 end;
5 /
2 from all_tab_cols
3 where table_name ='DEMO018A'
4 and owner ='AI_DEMO';
--------------- ------------ ---------- ---------- ----------- ---------------
X1 100 .00000005 0 100 FREQUENCY
X2 100 .00000005 0 100 FREQUENCY
ID 9914368 1.0086E-07 0 1 NONE
OBJECT_NAME 94824 .000010546 0 1 NONE
SYS_STUXTXIJPH7 100 .00000005 0 100 FREQUENCY
P8F5TH2A09PZ12X
2 from DEMO018a
3 where x1 = 17
4 and x2 = 17;
----------------------------------------------------------
Plan hash value: 3250860338
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 100K| 2929K| 15208 (1)| 00:00:01 |
|* 1 | TABLE ACCESS STORAGE FULL| DEMO018A | 100K| 2929K| 15208 (1)| 00:00:01 |
--------------------------------------------------------------------------------------
---------------------------------------------------
filter("X1"=17 AND "X2"=17)
----------------------------------------------------------
5 recursive calls
0 db block gets
56200 consistent gets
0 physical reads
0 redo size
2835214 bytes sent via SQL*Net to client
7872 bytes received via SQL*Net from client
668 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
100000 rows processed
2 from DEMO018a
3 where x1 = 1
4 and x2 = 2;
----------------------------------------------------------
Plan hash value: 3250860338
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 50000 | 1464K| 15208 (1)| 00:00:01 |
|* 1 | TABLE ACCESS STORAGE FULL| DEMO018A | 50000 | 1464K| 15208 (1)| 00:00:01 |
--------------------------------------------------------------------------------------
---------------------------------------------------
filter("X1"=1 AND "X2"=2)
----------------------------------------------------------
2 recursive calls
0 db block gets
55537 consistent gets
0 physical reads
0 redo size
369 bytes sent via SQL*Net to client
535 bytes received via SQL*Net from client
1 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
0 rows processed
2 from user_tab_histograms
3 where table_name ='DEMO018A'
4 and column_name like 'SYS%'
5 order by endpoint_number
6 fetch first 1 row only ;
--------------- ----------
100000 50000
2 from DEMO018a
3 where x1 = 1
4 and x2 = 2;
----------------------------------------------------------
Plan hash value: 2414111795
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 2 | 60 | 4 (0)| 00:00:01 |
| 1 | TABLE ACCESS BY INDEX ROWID BATCHED| DEMO018A | 2 | 60 | 4 (0)| 00:00:01 |
|* 2 | INDEX RANGE SCAN | SYS_AI_gawwp6r5fuzfr | 1 | | 3 (0)| 00:00:01 |
---------------------------------------------------
-----
- dynamic statistics used: dynamic sampling (level=AUTO)
----------------------------------------------------------
0 recursive calls
0 db block gets
3 consistent gets
0 physical reads
0 redo size
369 bytes sent via SQL*Net to client
535 bytes received via SQL*Net from client
1 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
0 rows processed
No comments:
Post a Comment