Sunday, March 27, 2022

Automatic Indexing - Part VIII

In the previous blog post on Automatic Indexing (AI) we discussed how Oracle will try to create as few indexes as possible to satisfy existing workloads even if that means, reordering the columns in an existing index.
 
To show how AI create few indexes as possible, will create the following table which has a number of column with different number of distinct values.
 
ai_demo@PDB19> create table DEMO008
  2  nologging as
  3  select rownum as id,
  4     ceil(rownum/10) x1,
  5     ceil(rownum/100) as x2,
  6     ceil(rownum/1000) as x3,
  7     ceil(rownum/10000) as x4,
  8         a.*
  9  from all_objects a,
 10         all_users, all_users
 11  where rownum <=10000000;
 
Table created.
 
ai_demo@PDB19> select column_name,num_distinct,num_nulls,density
  2  from user_tab_cols
  3  where table_name ='DEMO008'
  4  and ( column_name like 'X%'
  5      or column_name like 'ID' )
  6  order by column_name;
 
COLUMN_NAM NUM_DISTINCT  NUM_NULLS    DENSITY
---------- ------------ ---------- ----------
ID              9914368          0 1.0086E-07
X1              1001984          0 9.9802E-07
X2               100824          0 9.9183E-06
X3                10000          0      .0001
X4                 1000          0       .001
 
Then we run the following queries within 15-minute window between AI tasks.
 
select sum(object_id) from demo008 where id = 1;
select sum(object_id) from demo008 where x1 = 11;
select sum(object_id) from demo008 where x2 = 2;
select sum(object_id) from demo008 where x3 = 1;
select sum(object_id) from demo008 where x4 = 1;
select sum(object_id) from demo008 where x1 = 11 and x2 =2;
select sum(object_id) from demo008 where x1 = 11 and x3 =1;
select sum(object_id) from demo008 where x1 = 11 and x4 =1;
select sum(object_id) from demo008 where x2 = 2 and x1 =11;
select sum(object_id) from demo008 where x2 = 2 and x3 =1;
select sum(object_id) from demo008 where x2 = 2 and x4 =1;
select sum(object_id) from demo008 where x1 = 11 and x3 =1;
select sum(object_id) from demo008 where x2 = 2 and x3 =1;
select sum(object_id) from demo008 where x4 = 1 and x3 =1;
select sum(object_id) from demo008 where x4 = 1 and x1 =11;
select sum(object_id) from demo008 where x4 = 1 and x2 =2;
select sum(object_id) from demo008 where x4 = 1 and x3 =1;
select sum(object_id) from demo008 where x1 = 11 and x2 =2 and x3=1;
select sum(object_id) from demo008 where x1 = 11 and x2 =2 and x4=1;
select sum(object_id) from demo008 where x1 = 11 and x3 =1 and x4=1;
select sum(object_id) from demo008 where x2 = 2 and x1 = 11 and x3=1;
select sum(object_id) from demo008 where x2 = 2 and x1 = 11 and x4=1;
select sum(object_id) from demo008 where x2 = 2 and x3=1 and x4=1;
select sum(object_id) from demo008 where x3=1 and x1 =11 and x2 = 2 ;
select sum(object_id) from demo008 where x3=1 and x1 =11 and x4=1;
select sum(object_id) from demo008 where x3=1 and x2 =2 and x4=1;
select sum(object_id) from demo008 where x4=1 and x1 = 11 and x2 = 2 ;
select sum(object_id) from demo008 where x4=1 and x1 = 11 and x3 = 1 ;
select sum(object_id) from demo008 where x4=1 and x2 = 2 and x3 = 1 ;
select sum(object_id) from demo008 where x1 = 11 and x2 = 2 and x3 = 1 and x4=1 ;
 
each of these queries have no choice but to perform a Full table scan as there are currently no indexes defined on the table. Each queries uses a different column list so for the 30 or so sql statements one could potentially create 30 or so different indexes to cover each and every sql predicate combinations used above.
 
But how many different indexes will AI create? Let’s have a look...
 
ai_demo@PDB19> select dbms_auto_index.report_activity( activity_start=> systimestamp - 2/24 ) report from dual;
 
REPORT
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
GENERAL INFORMATION
-------------------------------------------------------------------------------
 Activity start               : 17-FEB-2022 08:32:20
 Activity end                 : 17-FEB-2022 10:32:20
 Executions completed         : 8
 Executions interrupted       : 0
 Executions with fatal error  : 0
-------------------------------------------------------------------------------
 
SUMMARY (AUTO INDEXES)
-------------------------------------------------------------------------------
 Index candidates                              : 10
 Indexes created (visible / invisible)         : 6 (6 / 0)
 Space used (visible / invisible)              : 956.3 MB (956.3 MB / 0 B)
 Indexes dropped                               : 0
 SQL statements verified                       : 28
 SQL statements improved (improvement factor)  : 27 (180223.4x)
 SQL plan baselines created (SQL statements)   : 1 (1)
 Overall improvement factor                    : 28073.2x
-------------------------------------------------------------------------------
 
We can see that AI only create 5 different indexes, that’s it.
 
If we look at the indexes that have been created.
 
INDEX DETAILS
-------------------------------------------------------------------------------
1. The following indexes were created:
-------------------------------------------------------------------------------
---------------------------------------------------------------------------------
| Owner   | Table   | Index                | Key          | Type   | Properties |
---------------------------------------------------------------------------------
| AI_DEMO | DEMO008 | SYS_AI_1gkabk3k69af0 | X2,X3,X4     | B-TREE | NONE       |
| AI_DEMO | DEMO008 | SYS_AI_7fr86wu528cxt | X1,X2,X3,X4  | B-TREE | NONE       |
| AI_DEMO | DEMO008 | SYS_AI_7mc4bggn5fp8c | X3,X1,X4     | B-TREE | NONE       |
| AI_DEMO | DEMO008 | SYS_AI_7tmnu4xtfndxs | ID           | B-TREE | NONE       |
| AI_DEMO | DEMO008 | SYS_AI_99ybv2202k8vc | X4,X1,X2     | B-TREE | NONE       |
| AI_DEMO | T       | SYS_AI_8fytzztsvcg09 | X11,X2,X3,X4 | B-TREE | NONE       |
---------------------------------------------------------------------------------
-------------------------------------------------------------------------------
 
We can see that how 5 indexes can collectively cover all 30 odd different SQL predicates within the workload. Because leading columns of at least one index has the necessary columns of each sql predicate.
 
If we look at but one sql example within the AI report, the query with the predicates on just the X2 column
 
-------------------------------------------------------------------------------
 Parsing Schema Name  : AI_DEMO
 SQL ID               : 77ry2fm49t6kr
 SQL Text             : select sum(object_id) from demo008 where x2 = 2
 Improvement Factor   : 207938.3x
 
Execution Statistics:
-----------------------------
                    Original Plan                 Auto Index Plan
                    ----------------------------  ----------------------------
 Elapsed Time (s):  122702                        246
 CPU Time (s):      70543                         246
 Buffer Gets:       623815                        6
 Optimizer Cost:    56703                         6
 Disk Reads:        623790                        0
 Direct Writes:     0                             0
 Rows Processed:    3                             1
 Executions:        3                             1
 
 
PLANS SECTION
---------------------------------------------------------------------------------------------
 
- Original
-----------------------------
 Plan Hash Value  : 2593440112
 
---------------------------------------------------------------------------------
| Id | Operation                    | Name    | Rows | Bytes | Cost  | Time     |
---------------------------------------------------------------------------------
|  0 | SELECT STATEMENT             |         |      |       | 56703 |          |
|  1 |   SORT AGGREGATE             |         |    1 |     9 |       |          |
|  2 |    TABLE ACCESS STORAGE FULL | DEMO008 |   99 |   891 | 56703 | 00:00:03 |
---------------------------------------------------------------------------------
 
- With Auto Indexes
-----------------------------
 Plan Hash Value  : 2265491416
 
--------------------------------------------------------------------------------------------------------
| Id  | Operation                              | Name                 | Rows | Bytes | Cost | Time     |
--------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                       |                      |    1 |     9 |    6 | 00:00:01 |
|   1 |   SORT AGGREGATE                       |                      |    1 |     9 |      |          |
|   2 |    TABLE ACCESS BY INDEX ROWID BATCHED | DEMO008              |   33 |   297 |    6 | 00:00:01 |
| * 3 |     INDEX RANGE SCAN                   | SYS_AI_1gkabk3k69af0 |  100 |       |    3 | 00:00:01 |
--------------------------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
------------------------------------------
* 3 - access("X2"=2)
 
 
We see it can now be serviced with the new SYS_AI_1gkabk3k69af0 index, because it has the following columns (X2,X3,X4) with the column X2 is the leading portion. 
 
If we look at the details of all the newly create AI – though we got 7 indexes only 5 was really build and used.
 
ai_demo@PDB19> select owner,index_name,tablespace_name,compression,
  2         status,visibility,indexing,auto
  3  from all_indexes
  4  where table_name ='DEMO008'
  5  and owner ='AI_DEMO';
 
OWNER      INDEX_NAME                TABLESPACE_NAME           COMPRESSION   STATUS   VISIBILIT INDEXIN AUT
---------- ------------------------- ------------------------- ------------- -------- --------- ------- ---
AI_DEMO    SYS_AI_7tmnu4xtfndxs      TS_INDEX_DEMO             ADVANCED LOW  VALID    VISIBLE   FULL    YES
AI_DEMO    SYS_AI_7fr86wu528cxt      TS_INDEX_DEMO             ADVANCED LOW  VALID    VISIBLE   FULL    YES
AI_DEMO    SYS_AI_1gkabk3k69af0      TS_INDEX_DEMO             ADVANCED LOW  VALID    VISIBLE   FULL    YES
AI_DEMO    SYS_AI_d7fz19jj13uwy      TS_INDEX_DEMO             ADVANCED LOW  UNUSABLE INVISIBLE FULL    YES
AI_DEMO    SYS_AI_7mc4bggn5fp8c      TS_INDEX_DEMO             ADVANCED LOW  VALID    VISIBLE   FULL    YES
AI_DEMO    SYS_AI_3s5z2bkc5ryna      TS_INDEX_DEMO             ADVANCED LOW  UNUSABLE INVISIBLE FULL    YES
AI_DEMO    SYS_AI_99ybv2202k8vc      TS_INDEX_DEMO             ADVANCED LOW  VALID    VISIBLE   FULL    YES
 
7 rows selected.
 
ai_demo@PDB19>
ai_demo@PDB19> select index_name,column_name,column_position
  2  from user_ind_columns
  3  where table_name ='DEMO008'
  4  order by 1,3;
 
INDEX_NAME                COLUMN_NAM COLUMN_POSITION
------------------------- ---------- ---------------
SYS_AI_1gkabk3k69af0      X2                       1
SYS_AI_1gkabk3k69af0      X3                       2
SYS_AI_1gkabk3k69af0      X4                       3
SYS_AI_3s5z2bkc5ryna      X3                       1
SYS_AI_3s5z2bkc5ryna      X4                       2
SYS_AI_7fr86wu528cxt      X1                       1
SYS_AI_7fr86wu528cxt      X2                       2
SYS_AI_7fr86wu528cxt      X3                       3
SYS_AI_7fr86wu528cxt      X4                       4
SYS_AI_7mc4bggn5fp8c      X3                       1
SYS_AI_7mc4bggn5fp8c      X1                       2
SYS_AI_7mc4bggn5fp8c      X4                       3
SYS_AI_7tmnu4xtfndxs      ID                       1
SYS_AI_99ybv2202k8vc      X4                       1
SYS_AI_99ybv2202k8vc      X1                       2
SYS_AI_99ybv2202k8vc      X2                       3
SYS_AI_d7fz19jj13uwy      X2                       1
SYS_AI_d7fz19jj13uwy      X4                       2
 
18 rows selected.
 
 
These 5 newly created AI are all valid and visible can collectively service all 30 odd different sql predicates of the captured workload.
 
The unusable indexes got these index key columns
 
ai_demo@PDB19> select i1.index_name,i1.auto,i1.status,i1.visibility,
  2      listagg(i2.column_name,',') within group(order by i2.column_position) index_keys
  3  from user_indexes i1 ,
  4      user_ind_columns i2
  5  where i1.table_name ='DEMO008'
  6  and i1.status ='UNUSABLE'
  7  and i1.index_name = i2.index_name
  8  and i1.table_name = i2.table_name
  9  group by i1.index_name,i1.auto,i1.status,i1.visibility
 10  order by i1.index_name ;
 
INDEX_NAME           AUT STATUS   VISIBILIT INDEX_KEYS
-------------------- --- -------- --------- --------------------
SYS_AI_3s5z2bkc5ryna YES UNUSABLE INVISIBLE X3,X4
SYS_AI_d7fz19jj13uwy YES UNUSABLE INVISIBLE X2,X4
 
Though we got index on X2, X4 and X3, X4 as invisible, the existing index can service the following predicates
 
Where X2 = ?? and X4 = ??
Where X3 = ?? and X4 = ??
 
ai_demo@PDB19> set autotrace traceonly exp
ai_demo@PDB19> select sum(object_id) from demo008 where x2 = 2 and x4 =1;
 
Execution Plan
----------------------------------------------------------
Plan hash value: 2265491416
 
-------------------------------------------------------------------------------------------------------------
| Id  | Operation                            | Name                 | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                     |                      |     1 |    13 |     6   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE                      |                      |     1 |    13 |            |          |
|   2 |   TABLE ACCESS BY INDEX ROWID BATCHED| DEMO008              |   100 |  1300 |     6   (0)| 00:00:01 |
|*  3 |    INDEX RANGE SCAN                  | SYS_AI_1gkabk3k69af0 |    99 |       |     3   (0)| 00:00:01 |
-------------------------------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   3 - access("X2"=2 AND "X4"=1)
       filter("X4"=1)
 
ai_demo@PDB19> select sum(object_id) from demo008 where x4 = 1 and x3 =1;
 
Execution Plan
----------------------------------------------------------
Plan hash value: 2622279452
 
-------------------------------------------------------------------------------------------------------------
| Id  | Operation                            | Name                 | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                     |                      |     1 |    12 |     7   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE                      |                      |     1 |    12 |            |          |
|   2 |   TABLE ACCESS BY INDEX ROWID BATCHED| DEMO008              |   100 |  1200 |     7   (0)| 00:00:01 |
|*  3 |    INDEX RANGE SCAN                  | SYS_AI_7mc4bggn5fp8c |   100 |       |     4   (0)| 00:00:01 |
-------------------------------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   3 - access("X3"=1 AND "X4"=1)
       filter("X4"=1)
 
ai_demo@PDB19> set autotrace off
ai_demo@PDB19> select index_name,column_name,column_position
  2  from user_ind_columns
  3  where index_name in ( 'SYS_AI_1gkabk3k69af0','SYS_AI_7mc4bggn5fp8c' )
  4  order by 1,3;
 
INDEX_NAME           COLUMN_NAME          COLUMN_POSITION
-------------------- -------------------- ---------------
SYS_AI_1gkabk3k69af0 X2                                 1
SYS_AI_1gkabk3k69af0 X3                                 2
SYS_AI_1gkabk3k69af0 X4                                 3
SYS_AI_7mc4bggn5fp8c X3                                 1
SYS_AI_7mc4bggn5fp8c X1                                 2
SYS_AI_7mc4bggn5fp8c X4                                 3
 
6 rows selected.
 
 
From the experience that many DBA and developers out there would create many more than just these 5 indexes, partly because it’s just easier to create new index for each sql predicates that doesn’t currently have an approximate index and partly because its not always easy to capture and know all the sql predicate combinations might be used by an application.
 
This is one of the really nice capabilities of Automatic Indexing, in that it tries to service the known workloads it captures with as few indexes as possible, that have all be proven first to indeed improve SQL performance

No comments:

Post a Comment