Saturday, June 11, 2022

Automatic Indexing - Part XV

In the previous post, we discussed a scenario in which Oracle Automatic Indexing refused to create a VALID index, because the resultant index was too inefficient to access the necessary rows due to the poor clustering of data within the table.
 
If the performance of such an SQL were critical for business requirements, there is a way to address this scenario, by re-clustering the data within the table to align itself with the index. Although the re-clustering table operation can now be very easily performed online since Oracle Database 12.2 (without having to use the dbms_redefinition process), this is NOT automatically performed within the Autonomous Database self-tuning framework (yet).
 
But its an activity we can perform manually to improve the performance of such critical sql as follows.
 
ai_demo@PDB19> select table_name,index_name,tablespace_name,clustering_factor,
  2         status,visibility,indexing,auto
  3  from  user_indexes
  4  where table_name = 'DEMO015'
  5  order by 1,2;
 
TABLE_NAME INDEX_NAME            TABLESPACE_NAME  CLUSTERING_FACTOR STATUS   VISIBILIT INDEXIN AUT
---------- --------------------- ---------------- ----------------- -------- --------- ------- ---
DEMO015    SYS_AI_cvby0u1d95hd2  TS_INDEX_DEMO              4234524 UNUSABLE INVISIBLE FULL    YES
 
ai_demo@PDB19> alter table demo015 add clustering by linear order(c1);
 
Table altered.
 
ai_demo@PDB19> alter table demo015 move online;
 
Table altered.
 
ai_demo@PDB19> select table_name,index_name,tablespace_name,clustering_factor,
  2         status,visibility,indexing,auto
  3  from  user_indexes
  4  where table_name = 'DEMO015'
  5  order by 1,2;
 
no rows selected
 
with the data in the table now perfectly aligned with the index, we could expect the index to be more efficient method to retrieve the 1% of the data. 
 
If we rerun the sql each a number of times
 
ai_demo@PDB19> select max(object_id), count(*) from demo015 where c1 = 21;
 
MAX(OBJECT_ID)   COUNT(*)
-------------- ----------
          6953     100000
 
ai_demo@PDB19> select max(object_id), count(*) from demo015 where c1 = 22;
 
MAX(OBJECT_ID)   COUNT(*)
-------------- ----------
          6953     100000
 
ai_demo@PDB19> select max(object_id), count(*) from demo015 where c1 = 23;
 
MAX(OBJECT_ID)   COUNT(*)
-------------- ----------
          6953     100000
 
ai_demo@PDB19>
 
then wait until the next AI process period
 
ai_demo@PDB19> host timeout /T 900
 
ai_demo@PDB19> select dbms_auto_index.report_activity( activity_start=> systimestamp - 1/24 ) report from dual;
 
REPORT
----------------------------------------------------------
GENERAL INFORMATION
----------------------------------------------------------
 Activity start               : 29-MAR-2022 08:36:24
 Activity end                 : 29-MAR-2022 09:36:24
 Executions completed         : 4
 Executions interrupted       : 0
 Executions with fatal error  : 0
----------------------------------------------------------
 
SUMMARY (AUTO INDEXES)
----------------------------------------------------------
 Index candidates            : 5
 Indexes created             : 0
 Space used                  : 0 B
 Indexes dropped             : 0
 SQL statements verified     : 0
 SQL statements improved     : 0
 SQL plan baselines created  : 0
 Overall improvement factor  : 0x
----------------------------------------------------------
 
SUMMARY (MANUAL INDEXES)
----------------------------------------------------------
 Unused indexes    : 0
 Space used        : 0 B
 Unusable indexes  : 0
----------------------------------------------------------
 
ERRORS
----------------------------------------------------------
No errors found.
----------------------------------------------------------
 
We notice the AI is still not mentioned in the AI report and still remains UNUSABLE.
 
ai_demo@PDB19> select table_name,index_name,tablespace_name,clustering_factor,
  2         status,visibility,indexing,auto
  3  from  user_indexes
  4  where table_name = 'DEMO015'
  5  order by 1,2;
 
TABLE_NAME INDEX_NAME            TABLESPACE_NAME  CLUSTERING_FACTOR STATUS   VISIBILIT INDEXIN AUT
---------- --------------------- ---------------- ----------------- -------- --------- ------- ---
DEMO015    SYS_AI_cvby0u1d95hd2  TS_INDEX_DEMO              4234524 UNUSABLE INVISIBLE FULL    YES
 
ai_demo@PDB19> select table_name,index_name,column_name,column_position
  2  from user_ind_columns
  3  where table_name = 'DEMO015'
  4  order by 1,2,3;
 
TABLE_NAME  INDEX_NAME             COLUMN_NAM COLUMN_POSITION
----------- ---------------------- ---------- ---------------
DEMO015     SYS_AI_cvby0u1d95hd2   C1                       1
 
 
In order to prevent the same SQLs from being continually re-evaluated to see if an index might be preferable, the Automatic Indexing process puts previously evaluated SQLs on a type of blacklist and therefore don’t get subsequently re-evaluated.
 
So although the new clustering of the data within the table would now likely warrant the creation of a new index, if we just run the some SQLs as previously, nothing changes. No Automatic Index is created and the SQLs remain in their current “sub-optimal” state.
 
If we create a manual indexing on the column C1, will lead to an excellent clustering factor due to clustering of the data in the table.
 
ai_demo@PDB19> create index demo015_c1_idx
  2  on demo015(c1)
  3  nologging
  4  compress advanced low;
 
Index created.
 
ai_demo@PDB19> select table_name,index_name,column_name,column_position
  2  from user_ind_columns
  3  where table_name = 'DEMO015'
  4  order by 1,2,3;
 
TABLE_NAME  INDEX_NAME            COLUMN_NAM COLUMN_POSITION
----------- --------------------- ---------- ---------------
DEMO015     DEMO015_C1_IDX        C1                       1
DEMO015     SYS_AI_cvby0u1d95hd2  C1                       1
 
ai_demo@PDB19> select table_name,index_name,tablespace_name,clustering_factor,
  2          status,visibility,indexing,auto
  3  from  user_indexes
  4  where table_name = 'DEMO015'
  5  order by 1,2;
 
TABLE_NAME INDEX_NAME           TABLESPACE_NAME  CLUSTERING_FACTOR STATUS   VISIBILIT INDEXIN AUT
---------- -------------------- ---------------- ----------------- -------- --------- ------- ---
DEMO015    DEMO015_C1_IDX       TS_INDEX_DEMO               187035 VALID    VISIBLE   FULL    NO
DEMO015    SYS_AI_cvby0u1d95hd2 TS_INDEX_DEMO              4234490 UNUSABLE INVISIBLE FULL    YES
 
 
If we now re-run our original SQL statement that had been using a FTS execution plan and that we couldn’t make Automatic Indexing create a VALID index because when originally run, the data clustering was too poor within the table:
 
ai_demo@PDB19> set autotrace traceonly exp statistics
ai_demo@PDB19> select max(object_id), count(*) from demo015 where c1 = 37;
 
 
Execution Plan
----------------------------------------------------------
Plan hash value: 3643747066
 
--------------------------------------------------------------------------------------------
| Id  | Operation                            | Name           | Rows  | Bytes | Cost (%CPU)|
--------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                     |                |     1 |     7 |  2030   (1)|
|   1 |  SORT AGGREGATE                      |                |     1 |     7 |            |
|   2 |   TABLE ACCESS BY INDEX ROWID BATCHED| DEMO015        |   100K|   683K|  2030   (1)|
|*  3 |    INDEX RANGE SCAN                  | DEMO015_C1_IDX |   100K|       |   157   (1)|
--------------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   3 - access("C1"=37)
 
 
Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
       2027  consistent gets
        155  physical reads
      11824  redo size
        428  bytes sent via SQL*Net to client
        546  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed
 
 
This query is now also finally using the newly created index, because the CBO now too deems it to be more efficient with an index based execution plan.
 
The moral of the story. AI is still blind to attribute clustering features.
 
 

No comments:

Post a Comment