Saturday, June 11, 2022

Automatic Indexing - Part XIV

Discussed many times the importance of clustering factor in relation to the efficiency of indexes. With respect to the efficiency of AI including their usage within Oracle autonomous database environments, data clustering is just as important.
 
The following demo was run on Oracle 19c (19.13) running on Exadata cloud@customer platform, will begin by creating a simple table that has column C1 in which the data is populated in a manner where the data is very poorly clustered. 
 
ai_demo@PDB19> create table demo015
  2  nologging
  3  as
  4  select rownum as id,
  5         mod(rownum,100)+1 as C1 ,
  6         a.*
  7  from all_objects a, all_users b,
  8                all_users c
  9  where rownum <= 10000000;
 
Table created.
 
So we have 100 evenly distributed values, but they are distributed through the table. The following sql is basically returning 1% of the data and is executed a number of times..
 
ai_demo@PDB19> set autotrace traceonly exp statistics
ai_demo@PDB19> select max(object_id), count(*) from demo015 where c1 = 57;
 
 
Execution Plan
----------------------------------------------------------
Plan hash value: 1474407985
 
---------------------------------------------------------------------------
| Id  | Operation                  | Name    | Rows  | Bytes | Cost (%CPU)|
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT           |         |     1 |     7 | 51034   (1)|
|   1 |  SORT AGGREGATE            |         |     1 |     7 |            |
|*  2 |   TABLE ACCESS STORAGE FULL| DEMO015 |   100K|   683K| 51034   (1)|
---------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   2 - storage("C1"=57)
       filter("C1"=57)
 
 
Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
     187097  consistent gets
     187091  physical reads
          0  redo size
        428  bytes sent via SQL*Net to client
        546  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed
 
Without an index, the CBO currently has no choice but use a full table scan to access the table, so we wait for the next AI process to kick in
 
ai_demo@PDB19> host timeout /T 900
 
ai_demo@PDB19> select dbms_auto_index.report_activity( activity_start=> systimestamp - 1/24 ) report from dual;
 
REPORT
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
GENERAL INFORMATION
-------------------------------------------------------------------------------
 Activity start               : 29-MAR-2022 05:15:12
 Activity end                 : 29-MAR-2022 06:15:13
 Executions completed         : 4
 Executions interrupted       : 0
 Executions with fatal error  : 0
-------------------------------------------------------------------------------
 
SUMMARY (AUTO INDEXES)
-------------------------------------------------------------------------------
 Index candidates            : 4
 Indexes created             : 0
 Space used                  : 0 B
 Indexes dropped             : 0
 SQL statements verified     : 0
 SQL statements improved     : 0
 SQL plan baselines created  : 0
 Overall improvement factor  : 0x
-------------------------------------------------------------------------------
 
SUMMARY (MANUAL INDEXES)
-------------------------------------------------------------------------------
 Unused indexes    : 0
 Space used        : 0 B
 Unusable indexes  : 0
-------------------------------------------------------------------------------
 
ERRORS
---------------------------------------------------------------------------------------------
No errors found.
---------------------------------------------------------------------------------------------
 
 
The automatic indexing report makes no mention of AI on that table. If we look to see if any indexes have actually been created.
 
ai_demo@PDB19> select table_name,index_name,tablespace_name,clustering_factor,
  2         status,visibility,indexing,auto
  3  from  user_indexes
  4  where table_name = 'DEMO015'
  5  order by 1,2;
                                                        
TABLE_NAME  INDEX_NAME           TABLESPACE_NAME  CLUSTERING_FACTOR STATUS   VISIBILIT INDEXIN AUT
----------- ------------------------------------- ----------------- -------- --------- ------- ---
DEMO015     SYS_AI_cvby0u1d95hd2 TS_INDEX_DEMO              4234524 UNUSABLE INVISIBLE FULL    YES
 
ai_demo@PDB19>
ai_demo@PDB19> select table_name,index_name,column_name,column_position
  2  from user_ind_columns
  3  where table_name = 'DEMO015'
  4  order by 1,2,3;
                                                               
TABLE_NAME  INDEX_NAME           COLUMN_NAM COLUMN_POSITION
----------- -------------------- ---------- ---------------
DEMO015     SYS_AI_cvby0u1d95hd2 C1                       1
 
 
We can see that AI has been created on the column C1, but it remains in an INVISIBLE and UNUSABLE state.
 
So Automatic Indexing considered an index on C1, created it in an INVISIBLE, USABLE state but when testing it, failed in that it found it to be less efficient than the current FTS and so reverted the Automatic Index back to an UNUSABLE index.
 
Therefore, if we run a bunch of other similar SQL statement such as the following
 
ai_demo@PDB19> select max(object_id), count(*) from demo015 where c1 = 24;
 
MAX(OBJECT_ID)   COUNT(*)
-------------- ----------
          6953     100000
 
ai_demo@PDB19> select max(object_id), count(*) from demo015 where c1 = 57;
 
MAX(OBJECT_ID)   COUNT(*)
-------------- ----------
          6953     100000
 
ai_demo@PDB19> select max(object_id), count(*) from demo015 where c1 = 13;
 
MAX(OBJECT_ID)   COUNT(*)
-------------- ----------
          6953     100000
 
 
They all use the FTS again, the CBO has no choice with no valid index on C1 column available. If we keep checking the AI report..
 
Still no mention of an index on C1 column, the existing AI remains in the unusable state.
 
ai_demo@PDB19> select table_name,index_name,tablespace_name,clustering_factor,
  2         status,visibility,indexing,auto
  3  from  user_indexes
  4  where table_name = 'DEMO015'
  5  order by 1,2;
                                                        
TABLE_NAME  INDEX_NAME           TABLESPACE_NAME  CLUSTERING_FACTOR STATUS   VISIBILIT INDEXIN AUT
----------- ------------------------------------- ----------------- -------- --------- ------- ---
DEMO015     SYS_AI_cvby0u1d95hd2 TS_INDEX_DEMO              4234524 UNUSABLE INVISIBLE FULL    YES
 
ai_demo@PDB19>
ai_demo@PDB19> select table_name,index_name,column_name,column_position
  2  from user_ind_columns
  3  where table_name = 'DEMO015'
  4  order by 1,2,3;
                                                               
TABLE_NAME  INDEX_NAME           COLUMN_NAM COLUMN_POSITION
----------- -------------------- ---------- ---------------
DEMO015     SYS_AI_cvby0u1d95hd2 C1                       1
 
Basically the index remains inefficient because of clustering factor of 4234524, it is too inefficient to return 1% of data.  In the next post we will how we can improve the performance of this query. 
 
 
 

No comments:

Post a Comment