Monday, November 18, 2019

Storage Indexes - Part I

Storage Indexes are the most useful Exadata feature. These are not the indexes that are stored in the database like Bitmap or B*Tree indexes, they are not capable of identifying set of records that has a certain value in a given column. Rather they are the feature of storage server software that is designed to eliminate the disk IO. That is because they identify the location where the requested records are not. They work by storing the minimum and maximum value of disk storage units (which are 1MB by default).
 
Because SQL Predicates are passed to storage servers when smart scans are performed, the storage software can check the prerequisite against the storage index metadata (maximum and minimum values) before doing the requested IO. So any storage region that cannot possibly have a matching row is skipped. In many cases this can result in significant reduction in the amount of IO that must be performed.
 
The ability to monitor the storage index is very limited. The optimizer doesn’t know whether a storage index is used for a particular sql statement. Not does AWR or ASH capture any information about usage of storage index for a particular sql statement. There is a single statistics that tracks the storage index usage at database level. The statistics cell physical IO bytes saved by storage index keep tracks of accumulated IO that has been avoided by the use of storage index.
 
c##rajesh@PDB1> select s.name,m.value
  2  from v$statname s ,
  3       v$mystat m
  4  where s.statistic# = m.statistic#
  5  and s.name in ('cell physical IO bytes saved by storage index',
  6  'cell physical IO interconnect bytes returned by smart scan' );
 
NAME                                                                    VALUE
------------------------------------------------------------ ----------------
cell physical IO bytes saved by storage index                               0
cell physical IO interconnect bytes returned by smart scan                  0
 
c##rajesh@PDB1> $ type script.sql
 
 
set termout off serveroutput off
alter session set statistics_level=all;
select * from big_table where owner ='JYU';
set termout on
 
select * from table( dbms_xplan.display_cursor(format=>'allstats last'));
 
 
c##rajesh@PDB1> @script.sql
 
PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------
SQL_ID  dhxc0c7g6bufm, child number 0
-------------------------------------
select * from big_table where owner ='JYU'
 
Plan hash value: 3993303771
 
----------------------------------------------------------------------------------------
| Id  | Operation                 | Name      | A-Rows |   A-Time   | Buffers | Reads  |
----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT          |           |   7384 |00:00:00.35 |    2024K|   2024K|
|*  1 |  TABLE ACCESS STORAGE FULL| BIG_TABLE |   7384 |00:00:00.35 |    2024K|   2024K|
----------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   1 - storage("OWNER"='JYU')
       filter("OWNER"='JYU')
 
 
19 rows selected.
 
c##rajesh@PDB1> select s.name,m.value
  2  from v$statname s ,
  3      v$mystat m
  4  where s.statistic# = m.statistic#
  5  and s.name in ('cell physical IO bytes saved by storage index',
  6  'cell physical IO interconnect bytes returned by smart scan' );
 
NAME                                                                    VALUE
------------------------------------------------------------ ----------------
cell physical IO bytes saved by storage index                     11338186752
cell physical IO interconnect bytes returned by smart scan            7679576
 
 
In this example, the query used storage index that eliminated about 11GB of disk IO. This is the amount of additional IO that would have been necessary without storage indexes.