Sunday, October 16, 2022

Automatic indexing - Not supported - Part III

In the previous blogpost, discussed scenario where automatic indexing does not currently create automatic indexing and we may need to manually create necessary indexes.
 
In this post, we will discuss another scenario where AI will create an index, but we may need to manually create an even better one.
 
We will start by creating a large table 
 
ai_demo@ATP21C> create table demo035
  2  nologging as
  3  select rownum as id,
  4    mod(rownum,100000)+1 as x1,
  5    cast('Hello_world' as varchar2(20)) as x2,
  6    a.*
  7  from stage a, stage b
  8  where rownum <= 10000000;
 
Table created.
 
The main columns to note here are X1 - which contains 100K distinct values – and X2 that got only the same value “Hello_World” all together.
 
I will next run the following query a number of times..
 
ai_demo@ATP21C> select x2,x1 from demo035 where x1 = 42;
 
100 rows selected.
 
 
Execution Plan
----------------------------------------------------------
Plan hash value: 3123852742
 
-------------------------------------------------------------------------------------
| Id  | Operation                 | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT          |         |    98 |  1666 | 52048   (1)| 00:00:03 |
|*  1 |  TABLE ACCESS STORAGE FULL| DEMO035 |    98 |  1666 | 52048   (1)| 00:00:03 |
-------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   1 - storage("X1"=42)
       filter("X1"=42)
 
Note
-----
   - automatic DOP: Computed Degree of Parallelism is 1
 
 
Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
     191109  consistent gets
     191103  physical reads
          0  redo size
        964  bytes sent via SQL*Net to client
         48  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
        100  rows processed
 
 
Without an index, the CBO has no choice but to perform a Full table scan. An index on X1 would provide the necessary filtering to fetch and return the required rows.
 
But if this query was important enough, we could improve thing further by “overloading” the column X2 into the same index. So we could use the index exclusively to get all the necessary data, without having to access the table at all. Considering an index on just the X1 column would need to fetch a reasonable number of rows (100 rows) and would need to visit a substantial number of different table block due to its clustering, overloading the index in this scenario would substantially reduce the necessary workloads of this query.
 
So what does AI do in this scenario, is overloading an index considered?
 
ai_demo@ATP21C> select dbms_auto_index.report_activity(systimestamp -1/24, systimestamp,'TEXT','ALL','ALL') report from dual;
 
REPORT
--------------------------------------------------------------------------------
GENERAL INFORMATION
-------------------------------------------------------------------------------
 Activity start               : 12-AUG-2022 09:34:09
 Activity end                 : 12-AUG-2022 10:34:09
 Executions completed         : 4
 Executions interrupted       : 0
 Executions with fatal error  : 0
-------------------------------------------------------------------------------
 
SUMMARY (AUTO INDEXES)
-------------------------------------------------------------------------------
 Index candidates                              : 1
 Indexes created (visible / invisible)         : 1 (1 / 0)
 Space used (visible / invisible)              : 134.22 MB (134.22 MB / 0 B)
 Indexes dropped                               : 0
 SQL statements verified                       : 1
 SQL statements improved (improvement factor)  : 1 (191348x)
 SQL plan baselines created                    : 0
 Overall improvement factor                    : 191348x
-------------------------------------------------------------------------------
 
SUMMARY (MANUAL INDEXES)
-------------------------------------------------------------------------------
 Unused indexes    : 0
 Space used        : 0 B
 Unusable indexes  : 0
-------------------------------------------------------------------------------
 
INDEX DETAILS
-------------------------------------------------------------------------------
1. The following indexes were created:
*: invisible
-------------------------------------------------------------------------------
------------------------------------------------------------------------
| Owner   | Table   | Index                | Key | Type   | Properties |
------------------------------------------------------------------------
| AI_DEMO | DEMO035 | SYS_AI_2v4tz32vgabbg | X1  | B-TREE | NONE       |
------------------------------------------------------------------------
-------------------------------------------------------------------------------
 
VERIFICATION DETAILS
-------------------------------------------------------------------------------
1. The performance of the following statements improved:
-------------------------------------------------------------------------------
 Parsing Schema Name  : AI_DEMO
 SQL ID               : 6zrhtjvmmvdz2
 SQL Text             : select x2,x1 from demo035 where x1 = 42
 Improvement Factor   : 191348x
 
Execution Statistics:
-----------------------------
                    Original Plan                 Auto Index Plan
                    ----------------------------  ----------------------------
 Elapsed Time (s):  482991                        53219
 CPU Time (s):      94369                         6075
 Buffer Gets:       574045                        103
 Optimizer Cost:    52048                         102
 Disk Reads:        573316                        102
 Direct Writes:     0                             0
 Rows Processed:    300                           100
 Executions:        3                             1
 
 
PLANS SECTION
---------------------------------------------------------------------------------------------
 
- Original
-----------------------------
 Plan Hash Value  : 3123852742
 
--------------------------------------------------------------------------------
| Id | Operation                   | Name    | Rows | Bytes | Cost  | Time     |
--------------------------------------------------------------------------------
|  0 | SELECT STATEMENT            |         |      |       | 52048 |          |
|  1 |   TABLE ACCESS STORAGE FULL | DEMO035 |   98 |  1666 | 52048 | 00:00:03 |
--------------------------------------------------------------------------------
 
Notes
-----
- dop = 1
- px_in_memory_imc = no
- px_in_memory = no
 
 
- With Auto Indexes
-----------------------------
 Plan Hash Value  : 1691939753
 
-------------------------------------------------------------------------------------------------------
| Id  | Operation                             | Name                 | Rows | Bytes | Cost | Time     |
-------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                      |                      |   98 |  1666 |  102 | 00:00:01 |
|   1 |   TABLE ACCESS BY INDEX ROWID BATCHED | DEMO035              |   98 |  1666 |  102 | 00:00:01 |
| * 2 |    INDEX RANGE SCAN                   | SYS_AI_2v4tz32vgabbg |   98 |       |    3 | 00:00:01 |
-------------------------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
------------------------------------------
* 2 - access("X1"=42)
 
 
Notes
-----
- Dynamic sampling used for this statement ( level = 11 )
 
 
-------------------------------------------------------------------------------
-------------------------------------------------------------------------------
 
We see that automatic index on just the X1 column was created.
 
 
ai_demo@ATP21C> select index_name,index_type,indexing,auto,
  2         status,visibility,num_rows,leaf_blocks,clustering_factor
  3  from all_indexes
  4  where table_name = 'DEMO035'
  5  and owner ='AI_DEMO';
 
INDEX_NAME            INDEX_TYPE  INDEXIN AUT STATUS   VISIBILIT   NUM_ROWS LEAF_BLOCKS CLUSTERING_FACTOR
--------------------- ----------- ------- --- -------- --------- ---------- ----------- -----------------
SYS_AI_2v4tz32vgabbg  NORMAL      FULL    YES VALID    VISIBLE     10000000       15512          10000000
 
ai_demo@ATP21C> select table_name,index_name,column_name,column_position
  2  from all_ind_columns
  3  where table_name = 'DEMO035'
  4  order by 1,2;
 
TABLE_NAME INDEX_NAME            COLUMN_NAME COLUMN_POSITION
---------- --------------------- ----------- ---------------
DEMO035    SYS_AI_2v4tz32vgabbg  X1                        1
 
 
If we now re-run the query again
 
ai_demo@ATP21C> select x2,x1 from demo035 where x1 = 42;
 
100 rows selected.
 
 
Execution Plan
----------------------------------------------------------
Plan hash value: 1691939753
 
------------------------------------------------------------------------------------------------------------
| Id  | Operation                           | Name                 | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |                      |    98 |  1666 |   102   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID BATCHED| DEMO035              |    98 |  1666 |   102   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN                  | SYS_AI_2v4tz32vgabbg |    98 |       |     3   (0)| 00:00:01 |
------------------------------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   2 - access("X1"=42)
 
Note
-----
   - automatic DOP: Computed Degree of Parallelism is 1
 
 
Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
        104  consistent gets
          0  physical reads
          0  redo size
        964  bytes sent via SQL*Net to client
         48  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
        100  rows processed
 
 
The query uses the newly create automatic index, but with 104 consistent gets, it is still doing substantial amount of work here, if we manually create another index that overloads the only other columns required in this query
 
ai_demo@ATP21C> create index demo035_idx
  2  on demo035( x1,x2 )
  3  nologging
  4  compress advanced low;
 
Index created.
 
Using compress advanced low as used by automatic index, noting the X2 contain the same value for all the rows, making it particularly perfect for compression and the best-case scenario when it comes to minimal overheads potentially associated with overloading this index
 
ai_demo@ATP21C> select index_name,index_type,indexing,auto,
  2         status,visibility,num_rows,leaf_blocks,clustering_factor
  3  from all_indexes
  4  where table_name = 'DEMO035'
  5  and owner ='AI_DEMO';
 
INDEX_NAME           INDEX_TYPE INDEXIN AUT STATUS VISIBILIT   NUM_ROWS LEAF_BLOCKS CLUSTERING_FACTOR
-------------------- ---------- ------- --- ------ --------- ---------- ----------- -----------------
SYS_AI_2v4tz32vgabbg NORMAL     FULL    YES VALID  VISIBLE     10000000       15512          10000000
DEMO035_IDX          NORMAL     FULL    NO  VALID  VISIBLE     10000000       16667          10000000
 
So giving AI the best possible scenario, in which it could potentially create an overloaded index. But I have never been able to get AI to create overloaded index. Only columns in the filtering predicate are considered for inclusion in automatic indexes.
 
If we re-run the query again.
 
ai_demo@ATP21C> select x2,x1 from demo035 where x1 = 42;
 
100 rows selected.
 
 
Execution Plan
----------------------------------------------------------
Plan hash value: 2152906945
 
--------------------------------------------------------------------------------
| Id  | Operation        | Name        | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------
|   0 | SELECT STATEMENT |             |    98 |  1666 |     3   (0)| 00:00:01 |
|*  1 |  INDEX RANGE SCAN| DEMO035_IDX |    98 |  1666 |     3   (0)| 00:00:01 |
--------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   1 - access("X1"=42)
 
Note
-----
   - automatic DOP: Computed Degree of Parallelism is 1
 
 
Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
          5  consistent gets
          0  physical reads
          0  redo size
        964  bytes sent via SQL*Net to client
         48  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
        100  rows processed
 
We notice the CBO now uses the manually created index without any table access path, as it can just use the index to access the necessary data. The number of consistent gets was significantly reduced to 5, a fraction of the previous 104 when the automatic index was used.
 
So the scenario of an overloaded index that could significantly reduce database resource, which is currently not supported by AI, is another example of where many want to manually create necessary index.
 

Monday, September 26, 2022

Automatic indexing - Not supported - Part II

In the pervious post, we discussed how AI process does not create auto indexing in the scenario where minimum or maximum of a column is required.
 
Another scenario when an AI is not created is when we hit the issues associated with the missing index on a foreign key constraint.
 
Deleting a record from the parent table without index on the associate foreign key column of the child table, we hit a number of issues including having to perform a full table scan on the child tables and the associated locking problems.
 
To illustrate, we will first create a parent table.
 
ai_demo@ATP21C> create table demo035p
  2  nologging
  3  as
  4  select rownum as id , a.*
  5  from stage a
  6  where rownum <= 10000;
 
Table created.
 
And then a larger child table, with no index on the associated foreign key constraint.
 
ai_demo@ATP21C> create table demo035c
  2  nologging
  3  as
  4  select rownum as pid ,
  5     mod(rownum,1000)+1000 as cid ,
  6     a.*
  7  from stage a, stage b, stage c
  8  where rownum <= 10000000 ;
 
Table created.
 
ai_demo@ATP21C> alter table demo035p
  2  add constraint pk_demo035p
  3  primary key(id);
 
Table altered.
 
ai_demo@ATP21C> alter table demo035c
  2  add constraint fk_demo035c
  3  foreign key(cid)
  4  references demo035p;
 
Table altered.
 
If we decide to delete rows from the parent table
 
ai_demo@ATP21C> set autotrace traceonly exp statistics
ai_demo@ATP21C> delete from demo035p where id = 55;
 
1 row deleted.
 
 
Execution Plan
----------------------------------------------------------
Plan hash value: 4215583649
 
----------------------------------------------------------------------------------
| Id  | Operation          | Name        | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------
|   0 | DELETE STATEMENT   |             |     1 |     4 |     1   (0)| 00:00:01 |
|   1 |  DELETE            | DEMO035P    |       |       |            |          |
|*  2 |   INDEX UNIQUE SCAN| PK_DEMO035P |     1 |     4 |     1   (0)| 00:00:01 |
----------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   2 - access("ID"=55)
 
Note
-----
   - automatic DOP: Computed Degree of Parallelism is 1 because of no expensive parallel operation
   - PDML disabled because child referential constraints are present
 
 
Statistics
----------------------------------------------------------
         55  recursive calls
         12  db block gets
     164019  consistent gets
          6  physical reads
       2596  redo size
        340  bytes sent via SQL*Net to client
        520  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          5  sorts (memory)
          0  sorts (disk)
          1  rows processed
 
We notice that even thought we delete one row from the parent table, we perform a large number of consistent gets due to a necessary full table scan on the child table, as Oracle is forced to check the table for any possible FK violations. Without an index on child table foreign key column, Oracle has no choice but to perform the relative expensive full table scan.
 
Additionally, if we have an existing transaction on the child table
 
ai_demo@ATP21C> @sid
 
INST_ SESSION_ID
----- ----------
1          54092
 
ai_demo@ATP21C> insert into demo035c(pid,cid) values(10000001,1055);
 
1 row created.
 
 
And then in another session attempt to delete a row from the parent table (from a different session)
 
ai_demo@ATP21C> @sid
 
INST_ SESSION_ID
----- ----------
1          12038
 
ai_demo@ATP21C> delete from demo035p where id = 56;
 
that delete statement goes into a block state, due to uncommitted transaction in the child table, which can lead to further locking issues in other sessions.
 
ai_demo@ATP21C> select l.object_id,u.object_name,l.session_id,l.locked_mode
  2  from v$locked_object l,
  3      user_objects u
  4  where l.object_id = u.object_id;
 
 OBJECT_ID OBJECT_NAM SESSION_ID LOCKED_MODE
---------- ---------- ---------- -----------
    136548 DEMO035P        12038           3
    136548 DEMO035P        54092           3
    136549 DEMO035C        54092           3
 
ai_demo@ATP21C> select sid,blocking_session_status, blocking_session,event
  2  from gv$session
  3  where sid = 12038;
 
       SID BLOCKING_SE BLOCKING_SESSION EVENT
---------- ----------- ---------------- -----------------------------------
     12038 VALID                  54092 enq: TX - row lock contention
 
 
The full table scan on the child table and those associated locks can all be avoided by having an index on the FK constraint column, as these indexes can then be used to effectively police the constraint during such delete operation.
 
So what does AI does in this scenario? Currently it does nothing. In the Oracle database 21c(21.3) currently we can see an index in unusable/invisible
 
ai_demo@ATP21C> select index_name,index_type,indexing,auto,
  2         status,visibility
  3  from all_indexes
  4  where table_name in ('DEMO035P','DEMO035C')
  5  and owner ='AI_DEMO';
 
INDEX_NAME                INDEX_TYPE INDEXIN AUT STATUS   VISIBILIT
------------------------- ---------- ------- --- -------- ---------
SYS_AI_7zysyhnb663y9      NORMAL     FULL    YES UNUSABLE INVISIBLE
PK_DEMO035P               NORMAL     FULL    NO  VALID    VISIBLE  
 
ai_demo@ATP21C> select table_name,index_name,column_name,column_position
  2  from all_ind_columns
  3  where table_name in ('DEMO035P','DEMO035C')
  4  order by 1,2;
 
TABLE_NAME INDEX_NAME                COLUMN_NAME     COLUMN_POSITION
---------- ------------------------- --------------- ---------------
DEMO035C   SYS_AI_7zysyhnb663y9      CID                           1
DEMO035P   PK_DEMO035P               ID                            1
 
So you may need to manually create such an index on the FK constraint to improve the performance and eliminate the locking issues.
 
ai_demo@ATP21C> create index idx_demo035c on demo035c(cid);
 
Index created.
 
ai_demo@ATP21C> set autotrace traceonly exp statistics
ai_demo@ATP21C> delete from demo035p where id = 56;
 
1 row deleted.
 
 
Execution Plan
----------------------------------------------------------
Plan hash value: 4215583649
 
----------------------------------------------------------------------------------
| Id  | Operation          | Name        | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------
|   0 | DELETE STATEMENT   |             |     1 |     4 |     1   (0)| 00:00:01 |
|   1 |  DELETE            | DEMO035P    |       |       |            |          |
|*  2 |   INDEX UNIQUE SCAN| PK_DEMO035P |     1 |     4 |     1   (0)| 00:00:01 |
----------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   2 - access("ID"=56)
 
Note
-----
   - automatic DOP: Computed Degree of Parallelism is 1 because of no expensive parallel operation
   - PDML disabled because child referential constraints are present
 
 
Statistics
----------------------------------------------------------
         22  recursive calls
          7  db block gets
         26  consistent gets
          6  physical reads
       1276  redo size
        340  bytes sent via SQL*Net to client
        520  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          3  sorts (memory)
          0  sorts (disk)
          1  rows processed
 
ai_demo@ATP21C> set autotrace off
 
 
consistent gets were dropped massively as Oracle can now use the index to avoid the FTS search on the child table. The associated locking issues are eliminated as well.