Thursday, February 25, 2010

Secret Sauce

A frequently misunderstood concept with NULLS and indexes is the myth that NULL values are not tracked in NORMAL indexes such as a conventional B*Tree index. Therefore, many people conclude erroneously that WHERE COLUMN IS NULL will not use, in fact cannot, use an index.

The facts are:

    * Entirely NULL keys are not entered into a NORMAL B*Tree in Oracle

    * Therefore, if you have a concatenated index on say C1 and C2, then you will likely find NULL values in it – since you could have a row Where C1 is NULL but C2 is     NOT NULL – that key value will be in the index.

scott@10G> create table T
  2  as select *
  3  from dba_objects
  4  /

Table created.

scott@10G> create index t_ind on t(object_id);

Index created.

scott@10G> exec dbms_stats.gather_table_stats(ownname=>USER,tabname=>'T',cascade=>TRUE);

PL/SQL procedure successfully completed.

scott@10G> set autotrace traceonly explain;
scott@10G> select * from t
  2  where object_id is null;

Execution Plan
--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |     1 |    93 |   165   (2)| 00:00:02 |
|*  1 |  TABLE ACCESS FULL| T    |     1 |    93 |   165   (2)| 00:00:02 |
--------------------------------------------------------------------------

As long as you have some not null column in your set of indexed columns, you will find NULLs in the index.
Additionally – you always have a NOT NULL (virtual) column to use no matter what! (Now see the Effect of Secret Sauce )


scott@10G> drop index t_ind;

Index dropped.

scott@10G> create index t_ind on t(object_id,0);

Index created.

scott@10G> exec dbms_stats.gather_table_stats(ownname=>USER,tabname=>'T',cascade=>TRUE);

PL/SQL procedure successfully completed.

scott@10G> select * from t
  2  where object_id is null;

  Execution Plan
-------------------------------------------------------------------------------------
| Id  | Operation                   | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |       |     1 |    95 |     3   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T     |     1 |    95 |     3   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | T_IND |     1 |       |     2   (0)| 00:00:01 |
-------------------------------------------------------------------------------------


Now we scanned the Index segment to answer our query, which is far better that scanning the entire Table segment.


The "secret sauce" here is to create what is known as a function based index
We simply added "0" to the index – we can search on OBJECT_ID as before, plus find all records where "OBJECT_ID IS NULL" without changing a thing. We can use the fact that entirely null entries are not made in B*Tree indexes as a positive thing.

No comments:

Post a Comment