Compressing an index can be an excellent way to permanently
reduce the size of an index in a very cost effective manner. Index entries with
many duplicate values (or duplicate leading columns within the index) can be
“compressed” by Oracle to reduce both storage overheads and potentially access
overheads for large index scans. Oracle basically de-duplicates repeated
indexed column values within each individual leaf block by storing each unique
occurrence in a prefix section within the block.
But it’s important to compress the right indexes in the
right manner. If indexes do not have enough repeated data, it’s quite possible
to make certain indexes larger rather than smaller when using compression (as
the overheads of having the prefix section in the index block outweighs the
benefits of limited reduction of repeated values). So one needs to be very
selective on which indexes to compress and take care to compress the correct
number of columns within the index. Oracle will only protect you from yourself
if you attempt to compress all columns in a unique index, as in this scenario
there can be no duplicate values to compress.
So, wouldn’t it be nice if Oracle made it all a lot easier
for us and automatically decided which indexes to compress, which columns
within the index to compress and which indexes to simply not bother compressing
at all. Additionally, rather than an all or nothing approach in which all index
leaf blocks are compressed in the same manner, wouldn’t it be nice if Oracle
decided for each and every individual leaf block within the index how to best
compress it. For those index leaf block that have no duplicate entries, do
nothing, for those with some repeated columns just compress them and for those
leaf blocks with lots of repeated columns and values to compress all of them as
efficiently as possible.
All these are being addressed by the recent release of
Oracle database 12.1.0.2, one of the really cool new feature got introduced was
Advanced Index Compression
rajesh@PDB1> create table t
2 nologging
3 as
4 select
owner,object_name,subobject_name,object_id,
5
data_object_id,object_type,created,last_ddl_time,
6
timestamp,status,temporary,generated,secondary,
7
namespace,edition_name,sharing,editionable,
8 oracle_maintained, case when id between
9 0 and 250000 then 0
10 else id end id
11 from big_table;
Table created.
So, I have fabricated the data such that values in the ID column are
effectively unique within 75% of the table but the other 25% consist of
repeated values.
If we create a normal B-Tree index without compression:
rajesh@PDB1> create index t_idx on t(id) nologging ;
Index created.
rajesh@PDB1> column index_name format a10
rajesh@PDB1> select index_name,compression,leaf_blocks
2 from user_indexes
3 where index_name ='T_IDX';
INDEX_NAME COMPRESSION
LEAF_BLOCKS
---------- ------------- -----------
T_IDX DISABLED 2124
1 row selected.
rajesh@PDB1>
We noticed that the index consist of 2124 leaf blocks.
If we try to use normal compression on the index:
rajesh@PDB1> alter index t_idx rebuild compress nologging;
Index altered.
rajesh@PDB1> select index_name,compression,leaf_blocks
2 from user_indexes
3 where index_name ='T_IDX';
INDEX_NAME COMPRESSION
LEAF_BLOCKS
---------- ------------- -----------
T_IDX ENABLED 2685
1 row selected.
rajesh@PDB1>
We notice that the compressed index rather than decrease in size has
actually increased in size, up to 2685
leaf blocks. So the index has grown by some 25% due to the fact the index
predominately contains unique values which don’t compress at all and the resultant
prefix section in the leaf blocks becomes nothing more than additional
overhead. The 25% section of the index containing all the repeated values has
indeed compressed effectively but these savings are more than offset by the
increase in size associated with the other 75% of the index where the index
entries had no duplication.
However, if we use the new advanced index compression capability via
the COMPRESS ADVANCED LOW clause:
rajesh@PDB1> alter index t_idx rebuild compress advanced low
nologging;
Index altered.
rajesh@PDB1>
rajesh@PDB1> select index_name,compression,leaf_blocks
2 from user_indexes
3 where index_name ='T_IDX';
INDEX_NAME COMPRESSION
LEAF_BLOCKS
---------- ------------- -----------
T_IDX ADVANCED
LOW 2054
1 row selected.
rajesh@PDB1>
We notice the index has now indeed decreased in size from the original
2124 leaf blocks down to 2054. Oracle has effectively ignored all those leaf
blocks where compression wasn’t viable and compressed just the 25% of the index
where compression was effective. Obviously, the larger the key values
(remembering the rowids associated with the index entries can’t be compressed)
and the larger the percentage of repeated data, the larger the overall
compression returns.
Advanced Index Compression enables Oracle to automatically just
compress those index leaf blocks where compression is beneficial.
If we look at a couple of partial block dumps from this index, first a
dump from a leaf block that did have duplicate index entries
rajesh@PDB1> column header_file new_val hf
rajesh@PDB1> column header_block new_val hb
rajesh@PDB1>
rajesh@PDB1> select file_id as header_file , block_id +5 as
header_block
2 from dba_extents
3 where segment_name ='T_IDX'
4 and owner = user
5 and extent_id = 0 ;
HEADER_FILE HEADER_BLOCK
----------- ------------
12 17197
1 row selected.
rajesh@PDB1>
rajesh@PDB1> alter system flush buffer_cache;
System altered.
rajesh@PDB1> @tkfilename.sql
D:\APP\179818\diag\rdbms\cdb1\cdb1\trace\cdb1_ora_5384.trc
rajesh@PDB1> alter system dump datafile &hf block &hb
;
old 1: alter system dump
datafile &hf block &hb
new 1: alter system dump
datafile 12 block 17197
System altered.
rajesh@PDB1> exit
Disconnected from Oracle Database 12c Enterprise Edition Release
12.1.0.2.0 - 64bit Production
With the Partitioning, OLAP, Advanced Analytics and Real
Application Testing options
Leaf block dump
===============
header address 408388196=0x18578264
kdxcolev 0
KDXCOLEV Flags = - - -
kdxcolok 0
kdxcoopc 0xa0: opcode=0: iot flags=-C- is converted=Y
kdxconco 2
kdxcosdc 0
kdxconro 652
kdxcofbo 1348=0x544
kdxcofeo 2164=0x874
kdxcoavs 816
kdxlespl 0
kdxlende 0
kdxlenxt 50348846=0x300432e
kdxleprv 50348844=0x300432c
kdxledsz 0
kdxlebksz 8036
kdxlepnro 1
kdxlepnco 1 (Adaptive)
prefix row#0[8032] flag: -P-----, lock: 0, len=4
col 0; len 1; (1):
80
prc 652
row#0[8023] flag: -------, lock: 0, len=9
col 0; len 6; (6): 03 00
41 96 00 04
psno 0
row#1[8014] flag: -------, lock: 0, len=9
col 0; len 6; (6): 03 00
41 96 00 05
psno 0
row#2[8005] flag: -------, lock: 0, len=9
col 0; len 6; (6): 03 00
41 96 00 06
psno 0
row#3[7996] flag: -------, lock: 0, len=9
col 0; len 6; (6): 03 00
41 96 00 07
psno 0
row#4[7987] flag: -------, lock: 0, len=9
col 0; len 6; (6): 03 00
41 96 00 08
psno 0
. . . . . . . .
. . . . . . . .
row#650[2173] flag: -------, lock: 0, len=9
col 0; len 6; (6): 03 00
41 a2 00 10
psno 0
row#651[2164] flag: -------, lock: 0, len=9
col 0; len 6; (6): 03 00
41 a2 00 11
psno 0
----- end of leaf block Logical dump -----
----- end of leaf block dump -----
End dump data blocks tsn: 4 file#: 12 minblk 17197 maxblk 17197
The red section is a portion of the
index header that determines the number of rows in the prefix table of the
index (kdxlepnro 1). The prefix table basically
lists all the distinct column values in the leaf blocks that are to be
compressed. The value 1 denotes there is actually only just the 1 distinct
column value in this specific leaf block (i.e. all index entries have the same
indexed value). This section also denotes how many of the indexed columns are
to be compressed (kdxlepnco 1). As this index
only has the one column, it also has a value of 1. Note this value can potentially
be anything between 0 (no columns compressed) up to the number of columns in
the index. The (Adaptive)
reference tells us that Index Advanced Compression has been used and that the
values here can change from leaf block to leaf block depending on the data
characteristics of the index entries within each leaf block (a dump of a basic
compressed index will not have the “Adaptive” reference).
The blue section is the compression
prefix table and details all the unique combinations of index entries to be
compressed within the leaf block. As all indexed values are the same in this
index (value 0, internally represented as 80 hex), the prefix table only has
the one row. prc 652 denotes that all 652 index entries in this leaf block have
this specific indexed value
Next follows all the actual index entries, which now only consist of
the rowid (the 6 byte col 0 column) as they all reference psno 0, which is the unique row id of the only row
within the prefix table (row#0).
So rather than storing the indexed value 652 times, we can just store
the index value (0) just the once within the prefix table and simply reference
it from within the actual index entries. This is why index compression can save
us storage, storing something once within a leaf block rather than multiple
times.
If we now look at a partial block dump of another index leaf block
within the index, that consists of many
differing (basically unique) index entries:
C:\Users\179818>sqlplus rajesh/oracle@pdb1
SQL*Plus: Release 12.1.0.2.0 Production on Sat Mar 21 11:25:55
2015
Copyright (c) 1982, 2014, Oracle. All rights reserved.
Last Successful login time: Sat Mar 21 2015 11:22:46 +05:30
Connected to:
Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 -
64bit Production
With the Partitioning, OLAP, Advanced Analytics and Real
Application Testing options
rajesh@PDB1>
rajesh@PDB1>
rajesh@PDB1> set timing off
rajesh@PDB1>
rajesh@PDB1> column header_file new_val hf
rajesh@PDB1> column header_block new_val hb
rajesh@PDB1> select file_id as header_file , block_id +3 as
header_block
2 from dba_extents
3 where segment_name ='T_IDX'
4 and owner = user
5 and extent_id = 29 ;
HEADER_FILE HEADER_BLOCK
----------- ------------
12 36867
1 row selected.
rajesh@PDB1>
rajesh@PDB1> @tkfilename.sql
D:\APP\179818\diag\rdbms\cdb1\cdb1\trace\cdb1_ora_11688.trc
rajesh@PDB1> alter system dump datafile &hf block &hb
;
old 1: alter system dump
datafile &hf block &hb
new 1: alter system dump
datafile 12 block 36867
System altered.
rajesh@PDB1> exit
Disconnected from Oracle Database 12c Enterprise Edition Release
12.1.0.2.0 - 64bit Prod
With the Partitioning, OLAP, Advanced Analytics and Real
Application Testing options
C:\Users\179818>
Leaf block dump
===============
header address 462979684=0x1b988264
kdxcolev 0
KDXCOLEV Flags = - - -
kdxcolok 0
kdxcoopc 0xa0: opcode=0: iot flags=-C- is converted=Y
kdxconco 2
kdxcosdc 0
kdxconro 449
kdxcofbo 938=0x3aa
kdxcofeo 1754=0x6da
kdxcoavs 816
kdxlespl 0
kdxlende 0
kdxlenxt 50368516=0x3009004
kdxleprv 50368514=0x3009002
kdxledsz 0
kdxlebksz 8036
kdxlepnro 0
kdxlepnco 0 (Adaptive)
row#0[8022] flag: -------, lock: 0, len=14
col 0; len 4; (4):
c3 57 36 32
col 1; len 6; (6): 03 00
7c 0a 00 40
row#1[8008] flag: -------, lock: 0, len=14
col 0; len 4; (4):
c3 57 36 33
col 1; len 6; (6): 03 00
7c 0a 00 41
row#2[7994] flag: -------, lock: 0, len=14
col 0; len 4; (4):
c3 57 36 34
col 1; len 6; (6): 03 00
7c 0a 00 42
row#3[7980] flag: -------, lock: 0, len=14
col 0; len 4; (4):
c3 57 36 35
col 1; len 6; (6): 03 00
7c 0a 00 43
row#4[7966] flag: -------, lock: 0, len=14
col 0; len 4; (4):
c3 57 36 36
col 1; len 6; (6): 03 00
7c 0a 00 44
. . . . . . . .
. . . . . . . .
row#447[1768] flag: -------, lock: 0, len=14
col 0; len 4; (4):
c3 57 3a 61
col 1; len 6; (6): 03 00
7d 03 00 20
row#448[1754] flag: -------, lock: 0, len=14
col 0; len 4; (4):
c3 57 3a 62
col 1; len 6; (6): 03 00
7d 03 00 21
----- end of leaf block Logical dump -----
----- end of leaf block dump -----
End dump data blocks tsn: 4 file#: 12 minblk 36867 maxblk 36867
We notice that in the red section, both kdxlepnro
0 and kdxlepnco 0 (Adaptive) have a value of 0, meaning we have no rows
and no columns within the prefix table. As such, we have no prefix table at all
here and that this leaf block has simply not been compressed.
If we look at the actual index entries, they all have an additional
column now in blue, that being the actual
indexed value as all the index values in this leaf block are different from
each other. Without some form of index entry duplication, there would be no
benefit from compression and Index Advanced Compression has automatically
determined this and not bothered to compress this leaf block. An attempt to
compress this block would have actually increased the necessary overall storage
for these index entries, due to the additional overheads associated with the
prefix table (note it has an additional 2 byes of overhead per row within the
prefix table).