Tuesday, January 20, 2015

How indexes store NULL values


Learnt something new in oracle it’s all about indexing NULL values.  How Oracle goes about indexing NULLs has prompted me to show how one could go about actually determining the answer. The basic question is are NULLs treated as just another column value and grouped accordingly or does Oracle have to somehow search through all the leaf blocks looking for all occurrences of these mysterious NULLs

The answer is that NULLs are basically considered to be potentially the largest value possible by Oracle and so are all grouped and sorted together at the “end” of the index structure (assuming the column is the leading column in the concatenated index, else they’ll be listed last for each distinct column that precedes it in the index).

The first obvious thing to check would be to create a little table and associated index with a few rows and a few NULL column values thrown in and see the results of a SELECT … ORDER BY.

The next thing to check would possibly be to use the DUMP function to again see what Oracle is likely to do with NULL values.

The best place to check of course is within the actual index itself. By determining the actual block that stores our example index, we can perform an index block dump and look at the resultant trace file that describes a representation of the index block to see precisely how Oracle deals with NULLs within indexes.


rajesh@ORA11G> create table t(x varchar2(1),y varchar2(1)); 

Table created. 

rajesh@ORA11G> insert into t values('A','A'); 

1 row created. 

rajesh@ORA11G> insert into t values('B','B'); 

1 row created. 

rajesh@ORA11G> insert into t values('C',NULL); 

1 row created. 

rajesh@ORA11G> insert into t values(NULL,'D'); 

1 row created. 

rajesh@ORA11G> insert into t values(NULL,NULL); 

1 row created. 

rajesh@ORA11G> commit; 

Commit complete. 

rajesh@ORA11G> select * from t order by x,y; 

X Y
- -
A A
B B
C
  D 

5 rows selected. 

rajesh@ORA11G>
### Note how the NULL values are logically ordered last in the above output. 
rajesh@ORA11G> column dump_x format a20
rajesh@ORA11G> column dump_y format a20
rajesh@ORA11G> select x, dump(x) dump_x,
  2     y, dump(y) dump_y
  3  from t
  4  order by x,y; 
X DUMP_X               Y DUMP_Y
- -------------------- - --------------------
A Typ=1 Len=1: 65      A Typ=1 Len=1: 65
B Typ=1 Len=1: 66      B Typ=1 Len=1: 66
C Typ=1 Len=1: 67        NULL
  NULL                 D Typ=1 Len=1: 68
  NULL                   NULL 
 
5 rows selected.
 
rajesh@ORA11G>
### Note the character “A” is represented by 65 and “B” by 66 etc
 
rajesh@ORA11G> create index t_idx on t(x,y); 
 
Index created.
rajesh@ORA11G>
rajesh@ORA11G> alter system flush buffer_cache; 
 
System altered. 
rajesh@ORA11G> column header_file new_val hf
rajesh@ORA11G> column header_block new_val hb
rajesh@ORA11G> select header_file, header_block+1 header_block
  2  from dba_segments
  3  where segment_name ='T_IDX' ; 
 
HEADER_FILE HEADER_BLOCK
----------- ------------
          6        19163 
 
1 row selected. 
 
rajesh@ORA11G> @tkfilename.sql
D:\APP\179818\diag\rdbms\ora11g\ora11g\trace\ora11g_ora_5528.trc 
 
rajesh@ORA11G> alter system dump datafile &hf block &hb;
old   1: alter system dump datafile &hf block &hb
new   1: alter system dump datafile          6 block      19163 
 
System altered. 
 
rajesh@ORA11G> exit 
### Block dump follows
Block header dump:  0x01804adb
 Object id on Block? Y
 seg/obj: 0x17c61  csc: 0x507.9b70310b  itc: 2  flg: E  typ: 2 - INDEX
     brn: 0  bdba: 0x1804ad8 ver: 0x01 opc: 0
     inc: 0  exflg: 0 
 Itl           Xid                  Uba         Flag  Lck        Scn/Fsc
0x01   0x0000.000.00000000  0x00000000.0000.00  ----    0  fsc 0x0000.00000000
0x02   0xffff.000.00000000  0x00000000.0000.00  C---    0  scn 0x0507.9b70310b
Leaf block dump
===============
header address 370311780=0x16128264
kdxcolev 0
KDXCOLEV Flags = - - -
kdxcolok 0
kdxcoopc 0x80: opcode=0: iot flags=--- is converted=Y
kdxconco 3
kdxcosdc 0
kdxconro 4
kdxcofbo 44=0x2c
kdxcofeo 7986=0x1f32
kdxcoavs 7942
kdxlespl 0
kdxlende 0
kdxlenxt 0=0x0
kdxleprv 0=0x0
kdxledsz 0
kdxlebksz 8036
row#0[8023] flag: ------, lock: 0, len=13       <<=== 1st index row
col 0; len 1; (1):  41     <<=== this is HEX ASCII value of column X has value of A
col 1; len 1; (1):  41     <<=== this is HEX ASCII value of column Y has value of A
col 2; len 6; (6):  01 80 07 cd 00 00  <<=== this represents the ROWID
row#1[8010] flag: ------, lock: 0, len=13       <<=== 2nd index row
col 0; len 1; (1):  42     <<=== this is HEX ASCII value of column X has value of B
col 1; len 1; (1):  42     <<=== this is HEX ASCII value of column Y has value of B
col 2; len 6; (6):  01 80 07 cd 00 01 <<=== this represents the ROWID
row#2[7998] flag: ------, lock: 0, len=12       <<=== 3rd index row
col 0; len 1; (1):  43     <<=== this is HEX ASCII value of column X has value of C
col 1; NULL                <<=== this is HEX ASCII value of column Y has value of NULL  
col 2; len 6; (6):  01 80 07 cd 00 02 <<=== this represents the ROWID
row#3[7986] flag: ------, lock: 0, len=12       <<=== 4th index row
col 0; NULL                <<=== this is HEX ASCII value of column X has value of NULL
col 1; len 1; (1):  44     <<=== this is HEX ASCII value of column Y has value of D
col 2; len 6; (6):  01 80 07 cd 00 03 <<=== this represents the ROWID
 
----- end of leaf block dump -----
End dump data blocks tsn: 7 file#: 6 minblk 19163 maxblk 19163
 
*** Note there are only 4 index entries for the 5 rows. Both rows with both X and Y set to NULL is NOT indexed as expected.

No comments:

Post a Comment