In the previous post we discussed how automatic indexing (AI)
currently does not consider non-equality predicates, AI will index the column
based only on equality predicates. So how does Oracle handle the scenario when
a SQL got a mixture of both equality and non-equality predicates?
Will begin by creating two very similar
tables, but the second table having a more selective “sales_amt” column.
ai_demo@PDB19> create
table demo026a
2 nologging as
3 select
4 sysdate - mod(rownum,50000)+1 as sales_dt ,
5 mod(rownum,500)+1 as sales_amt,
6 rownum as sales_id,
7 substr(a.object_name,1,30) as customer_name
8 from stage a, stage b, stage c
9 where rownum <= 10000000 ;
Table created.
ai_demo@PDB19> create
table demo027a
2 nologging as
3 select sysdate - mod(rownum,50000)+1 as sales_dt ,
4 mod(rownum,250000)+1 as sales_amt,
5 rownum as sales_id,
6 substr(a.object_name,1,30) as customer_name
7 from stage a, stage b, stage c
8 where rownum <= 10000000 ;
Table created.
Table DEMO026A got 500 distinct values
for the column SALES_AMT whereas the other table DEMO027A got 250000 distinct
values, we will run a few identical SQL’s, which both use an equality predicate
on SALES_AMT column and non-equality predicate on SALES_DT column, the
SALES_AMT column provides some filtering but in combination with the SALES_DT
column, results in ultimate filtering with no rows returned:
ai_demo@PDB19> select *
from demo026a where sales_amt = 55
2 and sales_dt > to_date('30-jun-2022','dd-mon-yyyy') ;
no rows selected
Execution Plan
----------------------------------------------------------
Plan hash value: 86708632
--------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 33 | 14305 (1)| 00:00:01 |
|* 1 | TABLE ACCESS STORAGE FULL| DEMO026A | 1 | 33 | 14305 (1)| 00:00:01 |
--------------------------------------------------------------------------------------
Predicate Information
(identified by operation id):
---------------------------------------------------
1 -
storage("SALES_DT">TO_DATE(' 2022-06-30 00:00:00', 'syyyy-mm-dd
hh24:mi:ss') AND "SALES_AMT"=55)
filter("SALES_DT">TO_DATE(' 2022-06-30 00:00:00', 'syyyy-mm-dd
hh24:mi:ss') AND "SALES_AMT"=55)
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
52216 consistent gets
0 physical reads
0 redo size
443 bytes sent via SQL*Net to client
535 bytes received via SQL*Net from client
1 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
0 rows processed
ai_demo@PDB19> select *
from demo027a where sales_amt = 55
2 and sales_dt > to_date('30-jun-2022','dd-mon-yyyy') ;
no rows selected
Execution Plan
----------------------------------------------------------
Plan hash value: 3781020016
--------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 34 | 14754 (1)| 00:00:01 |
|* 1 | TABLE ACCESS STORAGE FULL| DEMO027A | 1 | 34 | 14754 (1)| 00:00:01 |
--------------------------------------------------------------------------------------
Predicate Information
(identified by operation id):
---------------------------------------------------
1 - storage("SALES_AMT"=55 AND
"SALES_DT">TO_DATE(' 2022-06-30 00:00:00',
'syyyy-mm-dd hh24:mi:ss'))
filter("SALES_AMT"=55 AND "SALES_DT">TO_DATE(' 2022-06-30 00:00:00',
'syyyy-mm-dd hh24:mi:ss'))
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
53868 consistent gets
0 physical reads
0 redo size
443 bytes sent via SQL*Net to client
535 bytes received via SQL*Net from client
1 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
0 rows processed
So how does AI handle this scenario ? if
we look at the subsequent AI report:
SUMMARY (AUTO INDEXES)
-------------------------------------------------------------------------------
Index candidates : 4
Indexes created (visible / invisible) : 2 (1 / 1)
Space used (visible / invisible) : 268.44 MB (134.22 MB / 134.22 MB)
Indexes dropped : 0
SQL statements verified : 2
SQL statements improved (improvement factor) : 1 (2155.2x)
SQL plan baselines created : 0
Overall improvement factor : 2x
-------------------------------------------------------------------------------
SUMMARY (MANUAL INDEXES)
-------------------------------------------------------------------------------
Unused indexes : 0
Space used : 0 B
Unusable indexes : 0
-------------------------------------------------------------------------------
INDEX DETAILS
-------------------------------------------------------------------------------
1. The following indexes were created:
*: invisible
-------------------------------------------------------------------------------
---------------------------------------------------------------------------------
| Owner | Table | Index | Key | Type | Properties |
---------------------------------------------------------------------------------
| AI_DEMO | DEMO026A | * SYS_AI_fadrntvnb5uww | SALES_AMT | B-TREE | NONE |
| AI_DEMO | DEMO027A | SYS_AI_g1acyfvut3dj2 | SALES_AMT | B-TREE | NONE |
---------------------------------------------------------------------------------
-------------------------------------------------------------------------------
VERIFICATION DETAILS
-------------------------------------------------------------------------------
1. The performance of the following statements improved:
-------------------------------------------------------------------------------
Parsing Schema Name : AI_DEMO
SQL ID : 54x1vd6qsxv75
SQL Text : select * from demo027a where sales_amt = 55 and
sales_dt > to_date('30-jun-2022','dd-mon-yyyy')
Improvement Factor : 2155.2x
Execution Statistics:
-----------------------------
Original Plan Auto Index Plan
---------------------------- ----------------------------
Elapsed Time (s): 7613599 1393
CPU Time (s): 2406857 824
Buffer Gets: 161643 43
Optimizer Cost: 14754 44
Disk Reads: 53858 2
Direct Writes: 0 0
Rows Processed: 0 0
Executions: 3 1
PLANS SECTION
---------------------------------------------------------------------------------------------
- Original
-----------------------------
Plan Hash Value : 3781020016
---------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost | Time |
---------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | | | 14754 | |
| 1 | TABLE ACCESS STORAGE FULL | DEMO027A | 1 | 34 | 14754 | 00:00:01 |
---------------------------------------------------------------------------------
- With Auto Indexes
-----------------------------
Plan Hash Value : 1442602725
-------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost | Time |
-------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 34 | 44 | 00:00:01 |
| * 1 | TABLE ACCESS BY INDEX ROWID BATCHED | DEMO027A | 1 | 34 | 44 | 00:00:01 |
| * 2 | INDEX RANGE SCAN | SYS_AI_g1acyfvut3dj2 | 40 | | 3 | 00:00:01 |
-------------------------------------------------------------------------------------------------------
Predicate Information
(identified by operation id):
------------------------------------------
* 1 - filter("SALES_DT">TO_DATE(' 2022-06-30 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))
* 2 - access("SALES_AMT"=55)
Notes
-----
- Dynamic sampling used for this statement ( level = 11 )
If we look at the definition of all
indexes currently on these tables:
ai_demo@PDB19> select
table_name,index_name,indexing,auto,
2 status,visibility,clustering_factor,leaf_blocks
3 from all_indexes
4 where table_name in ( 'DEMO027A','DEMO026A' )
5 and owner ='AI_DEMO';
TABLE_NAME INDEX_NAME INDEXIN AUT STATUS VISIBILIT CLUSTERING_FACTOR LEAF_BLOCKS
---------- ------------------------- ------- --- -------- --------- ----------------- -----------
DEMO026A SYS_AI_fadrntvnb5uww FULL YES UNUSABLE INVISIBLE 10000000 15362
DEMO027A SYS_AI_g1acyfvut3dj2 FULL YES VALID VISIBLE 10000000 15938
ai_demo@PDB19> select
table_name,index_name,column_name,column_position
2 from user_ind_columns
3 where table_name in ( 'DEMO027A','DEMO026A' )
4 order by 1,2;
TABLE_NAME INDEX_NAME COLUMN_NAM COLUMN_POSITION
---------- ------------------------- ---------- ---------------
DEMO026A SYS_AI_fadrntvnb5uww SALES_AMT 1
DEMO027A SYS_AI_g1acyfvut3dj2 SALES_AMT 1
In both these cases, AI only created
index on SALES_AMT column, as it was the only column with an equality predicate.
However the AI on the table DEMO026A remained in an INVISIBLE/UNUSABLE state,
that’s because the index on only the SALES_AMT column was not enough to improve the performance of
the SQL, due to the filtering not being sufficient enough and because of the
relative poor clustering factor.
The index on the table DEMO027A was
eventually created as VISIBLE/VALID state, as its better filtering was
sufficient to actually improve the performance of the SQL.
So if we re-run the first query,
ai_demo@PDB19> set
autotrace traceonly exp statistics
ai_demo@PDB19> select * from demo026a where sales_amt = 55
2 and sales_dt > to_date('30-jun-2022','dd-mon-yyyy') ;
no rows selected
Execution Plan
----------------------------------------------------------
Plan hash value: 86708632
--------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 33 | 14305 (1)| 00:00:01 |
|* 1 | TABLE ACCESS STORAGE FULL| DEMO026A | 1 | 33 | 14305 (1)| 00:00:01 |
--------------------------------------------------------------------------------------
Predicate Information
(identified by operation id):
---------------------------------------------------
1 -
storage("SALES_DT">TO_DATE(' 2022-06-30 00:00:00', 'syyyy-mm-dd
hh24:mi:ss') AND "SALES_AMT"=55)
filter("SALES_DT">TO_DATE(' 2022-06-30 00:00:00', 'syyyy-mm-dd
hh24:mi:ss') AND "SALES_AMT"=55)
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
52216 consistent gets
0 physical reads
0 redo size
443 bytes sent via SQL*Net to client
535 bytes received via SQL*Net from client
1 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
0 rows processed
It continues to use a Full table scan, if
we re-run the second query
ai_demo@PDB19> select *
from demo027a where sales_amt = 55
2 and sales_dt > to_date('30-jun-2022','dd-mon-yyyy') ;
no rows selected
Execution Plan
----------------------------------------------------------
Plan hash value: 1442602725
------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 34 | 43 (0)| 00:00:01 |
|* 1 | TABLE ACCESS BY INDEX ROWID BATCHED| DEMO027A | 1 | 34 | 43 (0)| 00:00:01 |
|* 2 | INDEX RANGE SCAN | SYS_AI_g1acyfvut3dj2 | 40 | | 3 (0)| 00:00:01 |
------------------------------------------------------------------------------------------------------------
Predicate Information
(identified by operation id):
---------------------------------------------------
1 - filter("SALES_DT">TO_DATE('
2022-06-30 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))
2 - access("SALES_AMT"=55)
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
43 consistent gets
0
physical reads
0 redo size
443 bytes sent via SQL*Net to client
535 bytes received via SQL*Net from client
1 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
0 rows processed
It now uses the newly created AI, with an
improved consistent gets of 43, but if we were to create a manual index both (sales_amt
and sales_dt) the column
ai_demo@PDB19> create
index idx_demo027a_02
2 on demo027a( sales_amt,sales_dt )
3 compress advanced low;
Index created.
ai_demo@PDB19> set
autotrace traceonly exp statistics
ai_demo@PDB19> select * from demo027a where sales_amt = 55
2 and sales_dt > to_date('30-jun-2022','dd-mon-yyyy') ;
no rows selected
Execution Plan
----------------------------------------------------------
Plan hash value: 1668119774
-------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 34 | 4 (0)| 00:00:01 |
| 1 | TABLE ACCESS BY INDEX ROWID BATCHED| DEMO027A | 1 | 34 | 4 (0)| 00:00:01 |
|* 2 | INDEX RANGE SCAN | IDX_DEMO027A_02 | 1 | | 3 (0)| 00:00:01 |
-------------------------------------------------------------------------------------------------------
Predicate Information
(identified by operation id):
---------------------------------------------------
2 - access("SALES_AMT"=55
AND "SALES_DT">TO_DATE(' 2022-06-30 00:00:00', 'syyyy-mm-dd
hh24:mi:ss') AND
"SALES_DT" IS NOT NULL)
Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
3 consistent gets
2 physical reads
0 redo size
443 bytes sent via SQL*Net to client
535 bytes received via SQL*Net from client
1 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
0 rows processed
Performance improves significantly
further, by reducing consistent gets down by just 3. So if you have sql
statements with a mixture of both equality and non-equality predicates, then we
may encounter these two scenarios
A potentially efficient index that is not
created at all as the filtering on just the equality based predicates are not
sufficient to create viable index or
A potentially suboptimal AI that doesn’t
contain useful filtering columns, because they are used in non-equality
predicates.
2 nologging as
3 select
4 sysdate - mod(rownum,50000)+1 as sales_dt ,
5 mod(rownum,500)+1 as sales_amt,
6 rownum as sales_id,
7 substr(a.object_name,1,30) as customer_name
8 from stage a, stage b, stage c
9 where rownum <= 10000000 ;
2 nologging as
3 select sysdate - mod(rownum,50000)+1 as sales_dt ,
4 mod(rownum,250000)+1 as sales_amt,
5 rownum as sales_id,
6 substr(a.object_name,1,30) as customer_name
7 from stage a, stage b, stage c
8 where rownum <= 10000000 ;
2 and sales_dt > to_date('30-jun-2022','dd-mon-yyyy') ;
----------------------------------------------------------
Plan hash value: 86708632
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 33 | 14305 (1)| 00:00:01 |
|* 1 | TABLE ACCESS STORAGE FULL| DEMO026A | 1 | 33 | 14305 (1)| 00:00:01 |
--------------------------------------------------------------------------------------
---------------------------------------------------
hh24:mi:ss') AND "SALES_AMT"=55)
filter("SALES_DT">TO_DATE(' 2022-06-30 00:00:00', 'syyyy-mm-dd
hh24:mi:ss') AND "SALES_AMT"=55)
----------------------------------------------------------
0 recursive calls
0 db block gets
52216 consistent gets
0 physical reads
0 redo size
443 bytes sent via SQL*Net to client
535 bytes received via SQL*Net from client
1 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
0 rows processed
2 and sales_dt > to_date('30-jun-2022','dd-mon-yyyy') ;
----------------------------------------------------------
Plan hash value: 3781020016
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 34 | 14754 (1)| 00:00:01 |
|* 1 | TABLE ACCESS STORAGE FULL| DEMO027A | 1 | 34 | 14754 (1)| 00:00:01 |
--------------------------------------------------------------------------------------
---------------------------------------------------
'syyyy-mm-dd hh24:mi:ss'))
filter("SALES_AMT"=55 AND "SALES_DT">TO_DATE(' 2022-06-30 00:00:00',
'syyyy-mm-dd hh24:mi:ss'))
----------------------------------------------------------
0 recursive calls
0 db block gets
53868 consistent gets
0 physical reads
0 redo size
443 bytes sent via SQL*Net to client
535 bytes received via SQL*Net from client
1 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
0 rows processed
-------------------------------------------------------------------------------
Index candidates : 4
Indexes created (visible / invisible) : 2 (1 / 1)
Space used (visible / invisible) : 268.44 MB (134.22 MB / 134.22 MB)
Indexes dropped : 0
SQL statements verified : 2
SQL statements improved (improvement factor) : 1 (2155.2x)
SQL plan baselines created : 0
Overall improvement factor : 2x
-------------------------------------------------------------------------------
-------------------------------------------------------------------------------
Unused indexes : 0
Space used : 0 B
Unusable indexes : 0
-------------------------------------------------------------------------------
-------------------------------------------------------------------------------
1. The following indexes were created:
*: invisible
-------------------------------------------------------------------------------
---------------------------------------------------------------------------------
| Owner | Table | Index | Key | Type | Properties |
---------------------------------------------------------------------------------
| AI_DEMO | DEMO026A | * SYS_AI_fadrntvnb5uww | SALES_AMT | B-TREE | NONE |
| AI_DEMO | DEMO027A | SYS_AI_g1acyfvut3dj2 | SALES_AMT | B-TREE | NONE |
---------------------------------------------------------------------------------
-------------------------------------------------------------------------------
VERIFICATION DETAILS
-------------------------------------------------------------------------------
1. The performance of the following statements improved:
-------------------------------------------------------------------------------
Parsing Schema Name : AI_DEMO
SQL ID : 54x1vd6qsxv75
SQL Text : select * from demo027a where sales_amt = 55 and
sales_dt > to_date('30-jun-2022','dd-mon-yyyy')
Improvement Factor : 2155.2x
-----------------------------
Original Plan Auto Index Plan
---------------------------- ----------------------------
Elapsed Time (s): 7613599 1393
CPU Time (s): 2406857 824
Buffer Gets: 161643 43
Optimizer Cost: 14754 44
Disk Reads: 53858 2
Direct Writes: 0 0
Rows Processed: 0 0
Executions: 3 1
---------------------------------------------------------------------------------------------
-----------------------------
Plan Hash Value : 3781020016
| Id | Operation | Name | Rows | Bytes | Cost | Time |
---------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | | | 14754 | |
| 1 | TABLE ACCESS STORAGE FULL | DEMO027A | 1 | 34 | 14754 | 00:00:01 |
---------------------------------------------------------------------------------
-----------------------------
Plan Hash Value : 1442602725
| Id | Operation | Name | Rows | Bytes | Cost | Time |
-------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 34 | 44 | 00:00:01 |
| * 1 | TABLE ACCESS BY INDEX ROWID BATCHED | DEMO027A | 1 | 34 | 44 | 00:00:01 |
| * 2 | INDEX RANGE SCAN | SYS_AI_g1acyfvut3dj2 | 40 | | 3 | 00:00:01 |
-------------------------------------------------------------------------------------------------------
------------------------------------------
* 1 - filter("SALES_DT">TO_DATE(' 2022-06-30 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))
* 2 - access("SALES_AMT"=55)
-----
- Dynamic sampling used for this statement ( level = 11 )
2 status,visibility,clustering_factor,leaf_blocks
3 from all_indexes
4 where table_name in ( 'DEMO027A','DEMO026A' )
5 and owner ='AI_DEMO';
---------- ------------------------- ------- --- -------- --------- ----------------- -----------
DEMO026A SYS_AI_fadrntvnb5uww FULL YES UNUSABLE INVISIBLE 10000000 15362
DEMO027A SYS_AI_g1acyfvut3dj2 FULL YES VALID VISIBLE 10000000 15938
2 from user_ind_columns
3 where table_name in ( 'DEMO027A','DEMO026A' )
4 order by 1,2;
---------- ------------------------- ---------- ---------------
DEMO026A SYS_AI_fadrntvnb5uww SALES_AMT 1
DEMO027A SYS_AI_g1acyfvut3dj2 SALES_AMT 1
ai_demo@PDB19> select * from demo026a where sales_amt = 55
2 and sales_dt > to_date('30-jun-2022','dd-mon-yyyy') ;
----------------------------------------------------------
Plan hash value: 86708632
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 33 | 14305 (1)| 00:00:01 |
|* 1 | TABLE ACCESS STORAGE FULL| DEMO026A | 1 | 33 | 14305 (1)| 00:00:01 |
--------------------------------------------------------------------------------------
---------------------------------------------------
hh24:mi:ss') AND "SALES_AMT"=55)
filter("SALES_DT">TO_DATE(' 2022-06-30 00:00:00', 'syyyy-mm-dd
hh24:mi:ss') AND "SALES_AMT"=55)
----------------------------------------------------------
0 recursive calls
0 db block gets
52216 consistent gets
0 physical reads
0 redo size
443 bytes sent via SQL*Net to client
535 bytes received via SQL*Net from client
1 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
0 rows processed
2 and sales_dt > to_date('30-jun-2022','dd-mon-yyyy') ;
----------------------------------------------------------
Plan hash value: 1442602725
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 34 | 43 (0)| 00:00:01 |
|* 1 | TABLE ACCESS BY INDEX ROWID BATCHED| DEMO027A | 1 | 34 | 43 (0)| 00:00:01 |
|* 2 | INDEX RANGE SCAN | SYS_AI_g1acyfvut3dj2 | 40 | | 3 (0)| 00:00:01 |
------------------------------------------------------------------------------------------------------------
---------------------------------------------------
2 - access("SALES_AMT"=55)
----------------------------------------------------------
0 recursive calls
0 db block gets
43 consistent gets
0 redo size
443 bytes sent via SQL*Net to client
535 bytes received via SQL*Net from client
1 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
0 rows processed
2 on demo027a( sales_amt,sales_dt )
3 compress advanced low;
ai_demo@PDB19> select * from demo027a where sales_amt = 55
2 and sales_dt > to_date('30-jun-2022','dd-mon-yyyy') ;
----------------------------------------------------------
Plan hash value: 1668119774
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 34 | 4 (0)| 00:00:01 |
| 1 | TABLE ACCESS BY INDEX ROWID BATCHED| DEMO027A | 1 | 34 | 4 (0)| 00:00:01 |
|* 2 | INDEX RANGE SCAN | IDX_DEMO027A_02 | 1 | | 3 (0)| 00:00:01 |
-------------------------------------------------------------------------------------------------------
---------------------------------------------------
----------------------------------------------------------
1 recursive calls
0 db block gets
3 consistent gets
2 physical reads
0 redo size
443 bytes sent via SQL*Net to client
535 bytes received via SQL*Net from client
1 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
0 rows processed