Sunday, October 16, 2022

Automatic indexing - Not supported - Part III

In the previous blogpost, discussed scenario where automatic indexing does not currently create automatic indexing and we may need to manually create necessary indexes.
 
In this post, we will discuss another scenario where AI will create an index, but we may need to manually create an even better one.
 
We will start by creating a large table 
 
ai_demo@ATP21C> create table demo035
  2  nologging as
  3  select rownum as id,
  4    mod(rownum,100000)+1 as x1,
  5    cast('Hello_world' as varchar2(20)) as x2,
  6    a.*
  7  from stage a, stage b
  8  where rownum <= 10000000;
 
Table created.
 
The main columns to note here are X1 - which contains 100K distinct values – and X2 that got only the same value “Hello_World” all together.
 
I will next run the following query a number of times..
 
ai_demo@ATP21C> select x2,x1 from demo035 where x1 = 42;
 
100 rows selected.
 
 
Execution Plan
----------------------------------------------------------
Plan hash value: 3123852742
 
-------------------------------------------------------------------------------------
| Id  | Operation                 | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT          |         |    98 |  1666 | 52048   (1)| 00:00:03 |
|*  1 |  TABLE ACCESS STORAGE FULL| DEMO035 |    98 |  1666 | 52048   (1)| 00:00:03 |
-------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   1 - storage("X1"=42)
       filter("X1"=42)
 
Note
-----
   - automatic DOP: Computed Degree of Parallelism is 1
 
 
Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
     191109  consistent gets
     191103  physical reads
          0  redo size
        964  bytes sent via SQL*Net to client
         48  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
        100  rows processed
 
 
Without an index, the CBO has no choice but to perform a Full table scan. An index on X1 would provide the necessary filtering to fetch and return the required rows.
 
But if this query was important enough, we could improve thing further by “overloading” the column X2 into the same index. So we could use the index exclusively to get all the necessary data, without having to access the table at all. Considering an index on just the X1 column would need to fetch a reasonable number of rows (100 rows) and would need to visit a substantial number of different table block due to its clustering, overloading the index in this scenario would substantially reduce the necessary workloads of this query.
 
So what does AI do in this scenario, is overloading an index considered?
 
ai_demo@ATP21C> select dbms_auto_index.report_activity(systimestamp -1/24, systimestamp,'TEXT','ALL','ALL') report from dual;
 
REPORT
--------------------------------------------------------------------------------
GENERAL INFORMATION
-------------------------------------------------------------------------------
 Activity start               : 12-AUG-2022 09:34:09
 Activity end                 : 12-AUG-2022 10:34:09
 Executions completed         : 4
 Executions interrupted       : 0
 Executions with fatal error  : 0
-------------------------------------------------------------------------------
 
SUMMARY (AUTO INDEXES)
-------------------------------------------------------------------------------
 Index candidates                              : 1
 Indexes created (visible / invisible)         : 1 (1 / 0)
 Space used (visible / invisible)              : 134.22 MB (134.22 MB / 0 B)
 Indexes dropped                               : 0
 SQL statements verified                       : 1
 SQL statements improved (improvement factor)  : 1 (191348x)
 SQL plan baselines created                    : 0
 Overall improvement factor                    : 191348x
-------------------------------------------------------------------------------
 
SUMMARY (MANUAL INDEXES)
-------------------------------------------------------------------------------
 Unused indexes    : 0
 Space used        : 0 B
 Unusable indexes  : 0
-------------------------------------------------------------------------------
 
INDEX DETAILS
-------------------------------------------------------------------------------
1. The following indexes were created:
*: invisible
-------------------------------------------------------------------------------
------------------------------------------------------------------------
| Owner   | Table   | Index                | Key | Type   | Properties |
------------------------------------------------------------------------
| AI_DEMO | DEMO035 | SYS_AI_2v4tz32vgabbg | X1  | B-TREE | NONE       |
------------------------------------------------------------------------
-------------------------------------------------------------------------------
 
VERIFICATION DETAILS
-------------------------------------------------------------------------------
1. The performance of the following statements improved:
-------------------------------------------------------------------------------
 Parsing Schema Name  : AI_DEMO
 SQL ID               : 6zrhtjvmmvdz2
 SQL Text             : select x2,x1 from demo035 where x1 = 42
 Improvement Factor   : 191348x
 
Execution Statistics:
-----------------------------
                    Original Plan                 Auto Index Plan
                    ----------------------------  ----------------------------
 Elapsed Time (s):  482991                        53219
 CPU Time (s):      94369                         6075
 Buffer Gets:       574045                        103
 Optimizer Cost:    52048                         102
 Disk Reads:        573316                        102
 Direct Writes:     0                             0
 Rows Processed:    300                           100
 Executions:        3                             1
 
 
PLANS SECTION
---------------------------------------------------------------------------------------------
 
- Original
-----------------------------
 Plan Hash Value  : 3123852742
 
--------------------------------------------------------------------------------
| Id | Operation                   | Name    | Rows | Bytes | Cost  | Time     |
--------------------------------------------------------------------------------
|  0 | SELECT STATEMENT            |         |      |       | 52048 |          |
|  1 |   TABLE ACCESS STORAGE FULL | DEMO035 |   98 |  1666 | 52048 | 00:00:03 |
--------------------------------------------------------------------------------
 
Notes
-----
- dop = 1
- px_in_memory_imc = no
- px_in_memory = no
 
 
- With Auto Indexes
-----------------------------
 Plan Hash Value  : 1691939753
 
-------------------------------------------------------------------------------------------------------
| Id  | Operation                             | Name                 | Rows | Bytes | Cost | Time     |
-------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                      |                      |   98 |  1666 |  102 | 00:00:01 |
|   1 |   TABLE ACCESS BY INDEX ROWID BATCHED | DEMO035              |   98 |  1666 |  102 | 00:00:01 |
| * 2 |    INDEX RANGE SCAN                   | SYS_AI_2v4tz32vgabbg |   98 |       |    3 | 00:00:01 |
-------------------------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
------------------------------------------
* 2 - access("X1"=42)
 
 
Notes
-----
- Dynamic sampling used for this statement ( level = 11 )
 
 
-------------------------------------------------------------------------------
-------------------------------------------------------------------------------
 
We see that automatic index on just the X1 column was created.
 
 
ai_demo@ATP21C> select index_name,index_type,indexing,auto,
  2         status,visibility,num_rows,leaf_blocks,clustering_factor
  3  from all_indexes
  4  where table_name = 'DEMO035'
  5  and owner ='AI_DEMO';
 
INDEX_NAME            INDEX_TYPE  INDEXIN AUT STATUS   VISIBILIT   NUM_ROWS LEAF_BLOCKS CLUSTERING_FACTOR
--------------------- ----------- ------- --- -------- --------- ---------- ----------- -----------------
SYS_AI_2v4tz32vgabbg  NORMAL      FULL    YES VALID    VISIBLE     10000000       15512          10000000
 
ai_demo@ATP21C> select table_name,index_name,column_name,column_position
  2  from all_ind_columns
  3  where table_name = 'DEMO035'
  4  order by 1,2;
 
TABLE_NAME INDEX_NAME            COLUMN_NAME COLUMN_POSITION
---------- --------------------- ----------- ---------------
DEMO035    SYS_AI_2v4tz32vgabbg  X1                        1
 
 
If we now re-run the query again
 
ai_demo@ATP21C> select x2,x1 from demo035 where x1 = 42;
 
100 rows selected.
 
 
Execution Plan
----------------------------------------------------------
Plan hash value: 1691939753
 
------------------------------------------------------------------------------------------------------------
| Id  | Operation                           | Name                 | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |                      |    98 |  1666 |   102   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID BATCHED| DEMO035              |    98 |  1666 |   102   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN                  | SYS_AI_2v4tz32vgabbg |    98 |       |     3   (0)| 00:00:01 |
------------------------------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   2 - access("X1"=42)
 
Note
-----
   - automatic DOP: Computed Degree of Parallelism is 1
 
 
Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
        104  consistent gets
          0  physical reads
          0  redo size
        964  bytes sent via SQL*Net to client
         48  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
        100  rows processed
 
 
The query uses the newly create automatic index, but with 104 consistent gets, it is still doing substantial amount of work here, if we manually create another index that overloads the only other columns required in this query
 
ai_demo@ATP21C> create index demo035_idx
  2  on demo035( x1,x2 )
  3  nologging
  4  compress advanced low;
 
Index created.
 
Using compress advanced low as used by automatic index, noting the X2 contain the same value for all the rows, making it particularly perfect for compression and the best-case scenario when it comes to minimal overheads potentially associated with overloading this index
 
ai_demo@ATP21C> select index_name,index_type,indexing,auto,
  2         status,visibility,num_rows,leaf_blocks,clustering_factor
  3  from all_indexes
  4  where table_name = 'DEMO035'
  5  and owner ='AI_DEMO';
 
INDEX_NAME           INDEX_TYPE INDEXIN AUT STATUS VISIBILIT   NUM_ROWS LEAF_BLOCKS CLUSTERING_FACTOR
-------------------- ---------- ------- --- ------ --------- ---------- ----------- -----------------
SYS_AI_2v4tz32vgabbg NORMAL     FULL    YES VALID  VISIBLE     10000000       15512          10000000
DEMO035_IDX          NORMAL     FULL    NO  VALID  VISIBLE     10000000       16667          10000000
 
So giving AI the best possible scenario, in which it could potentially create an overloaded index. But I have never been able to get AI to create overloaded index. Only columns in the filtering predicate are considered for inclusion in automatic indexes.
 
If we re-run the query again.
 
ai_demo@ATP21C> select x2,x1 from demo035 where x1 = 42;
 
100 rows selected.
 
 
Execution Plan
----------------------------------------------------------
Plan hash value: 2152906945
 
--------------------------------------------------------------------------------
| Id  | Operation        | Name        | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------
|   0 | SELECT STATEMENT |             |    98 |  1666 |     3   (0)| 00:00:01 |
|*  1 |  INDEX RANGE SCAN| DEMO035_IDX |    98 |  1666 |     3   (0)| 00:00:01 |
--------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   1 - access("X1"=42)
 
Note
-----
   - automatic DOP: Computed Degree of Parallelism is 1
 
 
Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
          5  consistent gets
          0  physical reads
          0  redo size
        964  bytes sent via SQL*Net to client
         48  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
        100  rows processed
 
We notice the CBO now uses the manually created index without any table access path, as it can just use the index to access the necessary data. The number of consistent gets was significantly reduced to 5, a fraction of the previous 104 when the automatic index was used.
 
So the scenario of an overloaded index that could significantly reduce database resource, which is currently not supported by AI, is another example of where many want to manually create necessary index.