THE RAJESH BLOG: Automatic Indexing

In the pervious blog post on Automatic Indexing (AI) we discussed how the default index column order is column id, the order in which the columns are defined in the table, in this blog post we will discuss about the effect of index compression.

By default AI does not use index compression, however if we got access to advanced compression option, then we have the choice to turn on index compression in the following manner.

demo@PDB19> exec dbms_auto_index.configure('AUTO_INDEX_COMPRESSION','ON');

PL/SQL procedure successfully completed.

In this blog post, we will see how index column order is still critical to ensure effective compression on index segments. We begin by creating a simple table that has four columns X1,X2,X3 and X4. They are defined in this order within the table. But column X1 got most distinct values and X2 got 1000 distinct values and X3 got 10 distinct and X4 got only one distinct value.

ai_demo@PDB19> create table DEMO006
2 nologging as
3 select rownum as id,
4      mod(rownum,5000000)+1 as x1,
5      mod(rownum,1000)+1 as x2,
6      mod(rownum,10)+1 as x3,
7      'COMPLETED' as x4, a.*
8 from all_objects a,
9      all_users, all_users
10 where rownum <=10000000;

Table created.

In terms of being most efficient from the compression perspective, it would be better to have the index defined in as X4, X3, X2, X1 order, so the leading column in the index have the most duplicate values that enable effective deduplication and hence the index compression. Having X1 column in the leading edge, however with some many distinct values would effectively make the index non-compressible as the leading column would have too many distinct values to benefit from such compression.

So how does AI handle this scenario, does it keep the same default index column order or does it alter the index column order to provider better index compression benefits

Let’s run the following sql with all four predicates and see what AI creates…

ai_demo@PDB19> select count(*) from demo006
2 where x1 = 42 and x2 =42
3 and x3 = 4 and x4 ='COMPLETED';

COUNT(*)
----------
         0

ai_demo@PDB19> select count(*) from demo006
2 where x1 = 42 and x2 =42
3 and x3 = 4 and x4 ='COMPLETED';

COUNT(*)
----------
         0

If we look at the AI report for the period in which the above sql was run,

INDEX DETAILS
-------------------------------------------------------------------------------
1. The following indexes were created:
-------------------------------------------------------------------------------
---------------------------------------------------------------------------------
| Owner   | Table    | Index                | Key         | Type   | Properties |
---------------------------------------------------------------------------------
| AI_DEMO | DEMO006 | SYS_AI_6h7mffwc7pbd5 | X1,X2,X3,X4 | B-TREE | NONE       |
---------------------------------------------------------------------------------
-------------------------------------------------------------------------------

We could see that AI has created the index with all four columns from the sql predicate in again the default column order as the column order as defined in the table (X1, X2, X3, X4), even though AI compression was enabled, it hasn’t considered the column cardinalities in its determination of best index column order.

AI has the tendency to index all columns specified in the SQL predicates, regardless of whether all such column provide filtering benefits and do not consider the best column order from a compression perspective when determining index column order.

If we look at the definition and the size of the resultant AI

ai_demo@PDB19> select index_name,column_name,column_position
2 from user_ind_columns
3 where table_name ='DEMO006'
4 order by 1,3;

INDEX_NAME           COLUMN_NAM COLUMN_POSITION
-------------------- ---------- ---------------
SYS_AI_6h7mffwc7pbd5 X1                       1
SYS_AI_6h7mffwc7pbd5 X2                       2
SYS_AI_6h7mffwc7pbd5 X3                       3
SYS_AI_6h7mffwc7pbd5 X4                       4

ai_demo@PDB19> select owner,index_name,tablespace_name,compression,
2          status,visibility,indexing,auto
3 from all_indexes
4 where table_name ='DEMO006'
5 and owner ='AI_DEMO';

OWNER   INDEX_NAME           TABLESPACE_NAME COMPRESSION   STATUS   VISIBILIT INDEXIN AUT
------- -------------------- --------------- ------------- -------- --------- ------- ---
AI_DEMO SYS_AI_6h7mffwc7pbd5 TS_INDEX_DEMO   ADVANCED LOW VALID    VISIBLE   FULL    YES

If we were to create the index manually in the most appropriate manner from a compression perspective, with the index columns defined in reversal order and also with another index without the redundant X4 column.

ai_demo@PDB19> create index demo006_idx1
2 on demo006( x4,x3,x2,x1 )
3 nologging
4 compress advanced low ;

Index created.

ai_demo@PDB19> create index demo006_idx2
2 on demo006( x3,x2,x1 )
3 nologging
4 compress advanced low ;

Index created.

ai_demo@PDB19> select owner,index_name,leaf_blocks,
2             compression,status,auto
3 from all_indexes
4 where table_name ='DEMO006'
5 and owner ='AI_DEMO';

OWNER      INDEX_NAME                LEAF_BLOCKS COMPRESSION   STATUS   AUT
---------- ------------------------- ----------- ------------- -------- ---
AI_DEMO    SYS_AI_6h7mffwc7pbd5            35451 ADVANCED LOW VALID    YES
AI_DEMO    DEMO006_IDX1                    23509 ADVANCED LOW VALID    NO
AI_DEMO    DEMO006_IDX2                    23462 ADVANCED LOW VALID    NO

We notice the resulting (manually created) indexes were subsequently smaller than AI index at just 23509 and 23462 leaf blocks respectively.

Sunday, March 13, 2022

Automatic Indexing - Part VI

No comments:

Post a Comment