Thursday, August 8, 2013

Abusing functions

Learnt something new in oracle its all about a very common scenario, using function calls on database columns in predicates.

rajesh@ORA11GR2> create table t
  2  as
  3  select * from all_objects;

Table created.

Elapsed: 00:00:14.83
rajesh@ORA11GR2>
rajesh@ORA11GR2> begin
  2     dbms_stats.gather_table_stats
  3     (user,'T',
  4     method_opt=>'for all columns size 254');
  5  end;
  6  /

PL/SQL procedure successfully completed.

Well we started with a table T just a copy of all_objects and gathered histograms on all columns.

rajesh@ORA11GR2> select count(*) as total_cnt,
  2     count(*) * 0.01 as per_cnt,
  3     count( case when created >= to_date('30-oct-2011','dd-mon-yyyy')
  4             and created < to_date('30-oct-2011','dd-mon-yyyy') +1
  5             then 1 end ) per_day_cnt
  6  from t ;
 TOTAL_CNT    PER_CNT PER_DAY_CNT
---------- ---------- -----------
     72595     725.95       71874
1 row selected.

now, the total num of rows in table T is 72595 and one percent of it corresponds to 725 and nearly most of the records in table T belongs to 30-OCT-2011.

When we run queries to pull data for 30-OCT-2011 a very common implementation is                  trunc(created) ='30-oct-2011'. We get a completely incorrect plan ( estimated card differs from actuals)

rajesh@ORA11GR2> select * from t
  2  where trunc(created) = to_date('30-oct-2011','dd-mon-yyyy');
Elapsed: 00:00:00.00

Execution Plan
----------------------------------------------------------
Plan hash value: 1601196873
--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |   726 | 70422 |   292   (2)| 00:00:04 |
|*  1 |  TABLE ACCESS FULL| T    |   726 | 70422 |   292   (2)| 00:00:04 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter(TRUNC(INTERNAL_FUNCTION("CREATED"))=TO_DATE(' 2011-10-30
              00:00:00', 'syyyy-mm-dd hh24:mi:ss'))


Now, when you remove function calls, you get proper cardinality

rajesh@ORA11GR2> select * from t
  2  where created >= to_date('30-oct-2011','dd-mon-yyyy')
  3  and created < to_date('30-oct-2011','dd-mon-yyyy') +1  ;
Elapsed: 00:00:00.00

Execution Plan
----------------------------------------------------------
Plan hash value: 1601196873
--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      | 71738 |  6795K|   290   (1)| 00:00:04 |
|*  1 |  TABLE ACCESS FULL| T    | 71738 |  6795K|   290   (1)| 00:00:04 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter("CREATED"<TO_DATE(' 2011-10-31 00:00:00', 'syyyy-mm-dd
              hh24:mi:ss') AND "CREATED">=TO_DATE(' 2011-10-30 00:00:00',
              'syyyy-mm-dd hh24:mi:ss'))

So what does this mean to us, when you join with another data sets using function calls in database columns may let to sub-optimal plans. In this case the plans differs like this.

rajesh@ORA11GR2> select *
  2  from big_table b, t
  3  where b.id = t.object_id
  4  and trunc(t.created) = to_date('30-oct-2011','dd-mon-yyyy') ;
Elapsed: 00:00:00.02

Execution Plan
----------------------------------------------------------
Plan hash value: 908975253
-------------------------------------------------------------------------------------------
| Id  | Operation                    | Name       | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |            |   726 |   141K|  2470   (1)| 00:00:30 |
|   1 |  NESTED LOOPS                |            |       |       |            |          |
|   2 |   NESTED LOOPS               |            |   726 |   141K|  2470   (1)| 00:00:30 |
|*  3 |    TABLE ACCESS FULL         | T          |   726 | 70422 |   292   (2)| 00:00:04 |
|*  4 |    INDEX RANGE SCAN          | BIG_IDX_01 |     1 |       |     2   (0)| 00:00:01 |
|   5 |   TABLE ACCESS BY INDEX ROWID| BIG_TABLE  |     1 |   103 |     3   (0)| 00:00:01 |
-------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   3 - filter(TRUNC(INTERNAL_FUNCTION("T"."CREATED"))=TO_DATE(' 2011-10-30
              00:00:00', 'syyyy-mm-dd hh24:mi:ss'))
   4 - access("B"."ID"="T"."OBJECT_ID")

rajesh@ORA11GR2> select *
  2  from big_table b, t
  3  where b.id = t.object_id
  4  and t.created >= to_date('30-oct-2011','dd-mon-yyyy')
  5  and t.created < to_date('30-oct-2011','dd-mon-yyyy') +1 ;
Elapsed: 00:00:00.00

Execution Plan
----------------------------------------------------------
Plan hash value: 2514198030
----------------------------------------------------------------------------------------
| Id  | Operation          | Name      | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |           | 71738 |    13M|       | 48918   (1)| 00:09:48 |
|*  1 |  HASH JOIN         |           | 71738 |    13M|  7640K| 48918   (1)| 00:09:48 |
|*  2 |   TABLE ACCESS FULL| T         | 71738 |  6795K|       |   290   (1)| 00:00:04 |
|   3 |   TABLE ACCESS FULL| BIG_TABLE |  5000K|   491M|       | 21046   (1)| 00:04:13 |
----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - access("B"."ID"="T"."OBJECT_ID")
   2 - filter("T"."CREATED"<TO_DATE(' 2011-10-31 00:00:00', 'syyyy-mm-dd
              hh24:mi:ss') AND "T"."CREATED">=TO_DATE(' 2011-10-30 00:00:00', 'syyyy-mm-dd
              hh24:mi:ss'))

rajesh@ORA11GR2>