THE RAJESH BLOG: January 2011

Saturday, January 29, 2011

PARTITION RANGE SUBQUERY - New Access Path in 10G

I saw this and found it interesting.

If the table with the WHERE predicate is relatively small compared to the partitioned table, and the expected reduction of records or partitions for the partitioned table is significant, Oracle will perform dynamic partition pruning using a recursive sub-query

Here is the snippet of code to demonstrate that.

rajesh@10GR2> create table sales
2 (
3     SALES_ID                ,
4     CUST_ID                 ,
5     QUANTITY_SOLD   ,
6     AMOUNT_SOLD             ,
7     CUST_DATE
8 )
9 partition by range(CUST_DATE)
10 subpartition by hash(cust_id)
11 subpartition template
12 (
13    subpartition sp1,
14    subpartition sp2,
15    subpartition sp3,
16    subpartition sp4
17 )
18 (
19    partition p_q1_2005 values less than ( to_date('01-apr-2005','dd-mon-yyyy') ),
20    partition p_q2_2005 values less than ( to_date('01-jul-2005','dd-mon-yyyy') ),
21    partition p_q3_2005 values less than ( to_date('01-oct-2005','dd-mon-yyyy') ),
22    partition p_q4_2005 values less than ( to_date('01-jan-2006','dd-mon-yyyy') ),
23    partition p_q1_2006 values less than ( to_date('01-apr-2006','dd-mon-yyyy') ),
24    partition p_q2_2006 values less than ( to_date('01-jul-2006','dd-mon-yyyy') ),
25    partition p_q3_2006 values less than ( to_date('01-oct-2006','dd-mon-yyyy') ),
26    partition p_q4_2006 values less than ( to_date('01-jan-2007','dd-mon-yyyy') ),
27    partition p_q1_2007 values less than ( to_date('01-apr-2007','dd-mon-yyyy') ),
28    partition p_q2_2007 values less than ( to_date('01-jul-2007','dd-mon-yyyy') ),
29    partition p_q3_2007 values less than ( to_date('01-oct-2007','dd-mon-yyyy') ),
30    partition p_q4_2007 values less than ( to_date('01-jan-2008','dd-mon-yyyy') ),
31    partition p_q1_2008 values less than ( to_date('01-apr-2008','dd-mon-yyyy') ),
32    partition p_q2_2008 values less than ( to_date('01-jul-2008','dd-mon-yyyy') ),
33    partition p_q3_2008 values less than ( to_date('01-oct-2008','dd-mon-yyyy') ),
34    partition p_q4_2008 values less than ( to_date('01-jan-2009','dd-mon-yyyy') ),
35    partition p_q1_2009 values less than ( to_date('01-apr-2009','dd-mon-yyyy') ),
36    partition p_q2_2009 values less than ( to_date('01-jul-2009','dd-mon-yyyy') ),
37    partition p_q3_2009 values less than ( to_date('01-oct-2009','dd-mon-yyyy') ),
38    partition p_q4_2009 values less than ( to_date('01-jan-2010','dd-mon-yyyy') ),
39    partition p_q1_2010 values less than ( to_date('01-apr-2010','dd-mon-yyyy') ),
40    partition p_q2_2010 values less than ( to_date('01-jul-2010','dd-mon-yyyy') ),
41    partition p_q3_2010 values less than ( to_date('01-oct-2010','dd-mon-yyyy') ),
42    partition p_q4_2010 values less than ( to_date('01-jan-2011','dd-mon-yyyy') ),
43    partition p_q1_2011 values less than ( to_date('01-apr-2011','dd-mon-yyyy') ),
44    partition p_q2_2011 values less than ( to_date('01-jul-2011','dd-mon-yyyy') ),
45    partition p_q3_2011 values less than ( to_date('01-oct-2011','dd-mon-yyyy') ),
46    partition p_q4_2011 values less than ( to_date('01-jan-2012','dd-mon-yyyy') )
47 )
48 nologging
49 as
50 select             SALES_ID         ,
51                     CUST_ID          ,
52                     QUANTITY_SOLD    ,
53                     AMOUNT_SOLD      ,
54                     trunc(CUST_DATE)
55 from sales_data;

Table created.

Elapsed: 00:01:50.23
rajesh@10GR2>
rajesh@10GR2> select count(*) from sales;

COUNT(*)
----------
11863049

Elapsed: 00:00:35.03
rajesh@10GR2>
rajesh@10GR2>
rajesh@10GR2> exec dbms_stats.gather_table_stats(user,'SALES');

PL/SQL procedure successfully completed.

Elapsed: 00:03:51.75
rajesh@10GR2>
rajesh@10GR2> create index sales_idx_01 on sales(sales_id) nologging;

Index created.

Elapsed: 00:02:29.23
rajesh@10GR2>
rajesh@10GR2> create table times
2 nologging
3 as
4 select dt, 'Q'||to_char(dt,'Q')||'_'||to_char(dt,'YYYY') as fisc_quater
5 from (
6     select to_date('01-JAN-2005','dd-mon-yyyy') + level - 1 as dt
7     from dual
8     connect by level < = 2500
9 )
10 /

Table created.

Elapsed: 00:00:00.11
rajesh@10GR2>
rajesh@10GR2>
rajesh@10GR2> exec dbms_stats.gather_table_stats(user,'TIMES');

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.14
rajesh@10GR2>
rajesh@10GR2> set autotrace traceonly explain;
rajesh@10GR2>
rajesh@10GR2>
rajesh@10GR2> SELECT s.sales_id, s.amount_sold, s.quantity_sold, times.dt
2 from sales s, times
3 where times.dt = cust_date
4 and times.fisc_quater = 'Q3_2005'
5 /

Elapsed: 00:00:00.01

Execution Plan
----------------------------------------------------------
Plan hash value: 1453270267

---------------------------------------------------------------------------------
| Id | Operation                 | Name | Rows | Cost (%CPU)| Pstart| Pstop |
---------------------------------------------------------------------------------
|   0 | SELECT STATEMENT          |       | 6230K| 12218   (3)|        |       |
|* 1 | HASH JOIN                |       | 6230K| 12218   (3)|      |       |
|* 2 |   TABLE ACCESS FULL       | TIMES |    89 |     4   (0)|   |       |
|   3 |   PARTITION RANGE SUBQUERY|       |    11M| 12125   (2)| KEY(SQ)|KEY(SQ)|
|   4 |    PARTITION HASH ALL     |       |    11M| 12125   (2)|      1 |     4 |
|   5 |     TABLE ACCESS FULL     | SALES |    11M| 12125   (2)|    KEY |   KEY |
---------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   1 - access("TIMES"."DT"="CUST_DATE")
   2 - filter("TIMES"."FISC_QUATER"='Q3_2005')

rajesh@10GR2>
rajesh@10GR2> set autotrace off;

and Tkprof show me this,

SELECT s.sales_id, s.amount_sold, s.quantity_sold, times.dt
from sales s, times
where times.dt = cust_date
and times.fisc_quater = 'Q3_2005'

call     count       cpu    elapsed       disk      query    current        rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse        1      0.01       0.00          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch    68435     20.78      44.02      21906     114602          0    10265035
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total    68437     20.79      44.03      21906     114602          0    10265035

Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 100

Rows     Row Source Operation
------- ---------------------------------------------------
10265035 HASH JOIN (cr=114612 pr=21906 pw=0 time=102762307 us)
     92   TABLE ACCESS FULL TIMES (cr=10 pr=0 pw=0 time=158 us)
10265035   PARTITION RANGE SUBQUERY PARTITION: KEY(SUBQUERY) KEY(SUBQUERY) (cr=114602 pr=21906
10265035    PARTITION HASH ALL PARTITION: 1 4 (cr=114592 pr=21906 pw=0 time=51325251 us)
10265035     TABLE ACCESS FULL SALES PARTITION: KEY KEY (cr=114592 pr=21906 pw=0 time=41082

and Tkprof with Explain plan option shows me this,

SELECT s.sales_id, s.amount_sold, s.quantity_sold, times.dt
from sales s, times
where times.dt = cust_date
and times.fisc_quater = 'Q3_2005'

call     count       cpu    elapsed       disk      query    current        rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse        1      0.01       0.00          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch    68435     20.78      44.02      21906     114602          0    10265035
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total    68437     20.79      44.03      21906     114602          0    10265035

Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 100 (RAJESH)

Rows     Row Source Operation
------- ---------------------------------------------------
10265035 HASH JOIN (cr=114612 pr=21906 pw=0 time=102762307 us)
     92   TABLE ACCESS FULL TIMES (cr=10 pr=0 pw=0 time=158 us)
10265035   PARTITION RANGE SUBQUERY PARTITION: KEY(SUBQUERY) KEY(SUBQUERY) (cr=114602 pr=21906
10265035    PARTITION HASH ALL PARTITION: 1 4 (cr=114592 pr=21906 pw=0 time=51325251 us)
10265035     TABLE ACCESS FULL SALES PARTITION: KEY KEY (cr=114592 pr=21906 pw=0 time=41082

Rows     Execution Plan
------- ---------------------------------------------------
      0 SELECT STATEMENT   MODE: ALL_ROWS
10265035   HASH JOIN
     92    TABLE ACCESS   MODE: ANALYZED (FULL) OF 'TIMES' (TABLE)
10265035    PARTITION RANGE (SUBQUERY) PARTITION:KEY(SUBQUERY)KEY(SUBQUERY)
              SELECT distinct TBL$OR$IDX$PART$NUM("SALES", 0, d#, p#,
              "CUST_DATE") FROM (SELECT "TIMES"."DT" "CUST_DATE" FROM
              "TIMES" "TIMES" WHERE "TIMES"."FISC_QUATER"='Q3_2005') ORDER
                BY 110265035     PARTITION HASH (ALL) PARTITION: START=1 STOP=4
10265035      TABLE ACCESS   MODE: ANALYZED (FULL) OF 'SALES' (TABLE)
                 PARTITION:KEYKEY

As you see, Oracle has dynamically constructed a co-related subquery and used its results to drive this PARTITION RANGE (SUBQUERY) access path.

SELECT distinct TBL$OR$IDX$PART$NUM("SALES", 0, 1, 0, "CUST_DATE") FROM (SELECT "TIMES"."DT" "CUST_DATE" FROM "TIMES" "TIMES" WHERE
"TIMES"."FISC_QUATER"='Q3_2005') ORDER BY 1

call     count       cpu    elapsed       disk      query    current        rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse        1      0.01       0.00          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch        2      0.00       0.00          0         10          0           1
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total        4      0.01       0.01          0         10          0           1

Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 100 (RAJESH)   (recursive depth: 1)

Rows     Row Source Operation
------- ---------------------------------------------------
      1 SORT UNIQUE (cr=10 pr=0 pw=0 time=329 us)
     92   TABLE ACCESS FULL TIMES (cr=10 pr=0 pw=0 time=152 us)

Rows     Execution Plan
------- ---------------------------------------------------
      0 SELECT STATEMENT   MODE: ALL_ROWS
      1   SORT (UNIQUE)
     92    TABLE ACCESS   MODE: ANALYZED (FULL) OF 'TIMES' (TABLE)

I was wondering why should oracle use HASH JOIN since one of the data source in plan (table TIMES having 84 rows) is providing a low cardinality. Instead use NESTED LOOPS ( pick one table having low cardinality and iterate its rows over the other table and find matching rows )

Now, Forcing Nested Loops hint in query plan changes like this.

rajesh@10GR2> SELECT /*+ use_nl(s,t) */ s.sales_id, s.amount_sold, s.quantity_sold, t.dt
2 from sales s, times t
3 where t.dt = s.cust_date
4 and t.fisc_quater = 'Q3_2005'
5 /

Elapsed: 00:00:00.01

Execution Plan
----------------------------------------------------------
Plan hash value: 1526662114
---------------------------------------------------------------------------------
| Id | Operation                 | Name | Rows | Cost (%CPU)| Pstart| Pstop |
---------------------------------------------------------------------------------
|   0 | SELECT STATEMENT          |       | 6230K| 38540   (2)|        |       |
|   1 | NESTED LOOPS             |       | 6230K| 38540   (2)|        |       |
|* 2 |   TABLE ACCESS FULL       | TIMES |    89 |     4   (0)|        |       |
|   3 |   PARTITION RANGE ITERATOR|       | 69786 |   433   (2)|    KEY |   KEY |
|   4 |    PARTITION HASH ALL     |       | 69786 |   433   (2)|      1 |     4 |
|* 5 |     TABLE ACCESS FULL     | SALES | 69786 |   433   (2)|    KEY |   KEY |
---------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter("T"."FISC_QUATER"='Q3_2005')
   5 - filter("T"."DT"="S"."CUST_DATE")

rajesh@10GR2>

Now, the access path changes to PARTITION RANGE ITERATOR and its no more PARTITION RANGE SUBQUERY. Tkprof shows me this,

SELECT /*+ use_nl(s,t) */ s.sales_id, s.amount_sold, s.quantity_sold, t.dt
from sales s, times t
where t.dt = s.cust_date
and t.fisc_quater = 'Q3_2005'

call     count       cpu    elapsed       disk      query    current        rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse        1      0.01       0.00          0          0          2           0
Execute      1      0.00       0.00          0          0          0           0
Fetch    68435    189.48    3799.59    3289938    4344283          0    10265035
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total    68437    189.50    3799.60    3289938    4344283          2    10265035

Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 100 (RAJESH)

Thats a Huge difference ! using HASH join it took 44 sec but using NESTED LOOPS it took 3800 secs (almost 1 hrs, 3 mins).

Replicated these table's ( SALES, TIMES) to 9iR2 database and I dont see this PARTITION RANGE SUBQUERY happening there. Its nowhere available in 9i product documentation.

rajesh@9IR2> select * from v$version;

BANNER
----------------------------------------------------------------
Oracle9i Enterprise Edition Release 9.2.0.8.0 - Production
PL/SQL Release 9.2.0.8.0 - Production
CORE    9.2.0.8.0       Production
TNS for 32-bit Windows: Version 9.2.0.8.0 - Production
NLSRTL Version 9.2.0.8.0 - Production

Elapsed: 00:00:00.01
rajesh@9IR2>
rajesh@9IR2> SET AUTOTRACE TRACEONLY EXPLAIN;
rajesh@9IR2>
rajesh@9IR2> SELECT s.sales_id, s.quantity_sold, s.amount_sold, t.dt
2 from sales s, times t
3 where s.cust_date = t.dt
4 and t.fisc_quater = 'Q3_2005'
5 /

Elapsed: 00:00:00.01

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=5407 Card=3143029 Bytes=122578131)
   1    0   HASH JOIN (Cost=5407 Card=3143029 Bytes=122578131)
   2    1     TABLE ACCESS (FULL) OF 'TIMES' (Cost=3 Card=89 Bytes=1424)
   3    1     PARTITION RANGE (ITERATOR)   4    3       PARTITION HASH (ALL)
   5    4         TABLE ACCESS (FULL) OF 'SALES' (Cost=5234 Card=11863049 Bytes=272850127)

rajesh@9IR2>
rajesh@9IR2> SET AUTOTRACE OFF;
rajesh@9IR2>

And Tkprof shows me this,

SELECT s.sales_id, s.quantity_sold, s.amount_sold, t.dt
from sales s, times t
where s.cust_date = t.dt
and t.fisc_quater = 'Q3_2005'

call     count       cpu    elapsed       disk      query    current        rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch    68435     12.43      45.66      45959     114727          0    10265035
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total    68437     12.43      45.66      45959     114727          0    10265035

Misses in library cache during parse: 1
Optimizer mode: CHOOSE
Parsing user id: 107

Rows     Row Source Operation
------- ---------------------------------------------------
10265035 HASH JOIN
     92   TABLE ACCESS FULL TIMES
10265035   PARTITION RANGE ITERATOR PARTITION: KEY KEY
10265035    PARTITION HASH ALL PARTITION: 1 4
10265035     TABLE ACCESS FULL SALES PARTITION: KEY KEY

Now, I am wondering really why this PARTITION RANGE SUBQUERY access path is introduced in 10g database still Oracle 9i can do that efficiently without having PARTITION RANGE SUBQUERY in place.

Wednesday, January 26, 2011

Full Partition Wise Joins

A full partition-wise join divides a large join into smaller joins between a pair of partitions from the two joined tables. To use this feature, you must equipartition both tables on their join keys. For example, consider a large join between a sales table and a customer table on the column cust_id, the query " find the sales details for all customers who are effective ( having cust_date in Q3 2005 ) in Q3 2005 " is a typical example of a SQL Statement performing such a join is

SELECT c.* , s.*
from customers c, sales s
where c.cust_id = s.cust_id
and c.cust_date between to_date('01-jul-2005','dd-mon-yyyy')
and to_date('30-sep-2005','dd-mon-yyyy');

If we equipartition both customers and sales table based on this join keys like below.

rajesh@10GR2> create table customers
2 (
3     CUST_ID                                                 ,
4     CUST_FIRST_NAME                                 ,
5     CUST_LAST_NAME                                  ,
6     CUST_GENDER                                             ,
7     CUST_DATE                                               ,
8     CUST_POSTAL_CODE                                ,
9     CUST_CITY                                               ,
10     CUST_TOTAL
11 )
12 partition by range(CUST_DATE)
13 subpartition by hash(cust_id)
14 subpartition template
15 (
16     subpartition sp1,
17     subpartition sp2,
18     subpartition sp3,
19     subpartition sp4
20 )
21 (
22     partition p_q1_2005 values less than ( to_date('01-apr-2005','dd-mon-yyyy') ),
23     partition p_q2_2005 values less than ( to_date('01-jul-2005','dd-mon-yyyy') ),
24     partition p_q3_2005 values less than ( to_date('01-oct-2005','dd-mon-yyyy') ),
25     partition p_q4_2005 values less than ( to_date('01-jan-2006','dd-mon-yyyy') ),
26     partition p_q1_2006 values less than ( to_date('01-apr-2006','dd-mon-yyyy') ),
27     partition p_q2_2006 values less than ( to_date('01-jul-2006','dd-mon-yyyy') ),
28     partition p_q3_2006 values less than ( to_date('01-oct-2006','dd-mon-yyyy') ),
29     partition p_q4_2006 values less than ( to_date('01-jan-2007','dd-mon-yyyy') ),
30     partition p_q1_2007 values less than ( to_date('01-apr-2007','dd-mon-yyyy') ),
31     partition p_q2_2007 values less than ( to_date('01-jul-2007','dd-mon-yyyy') ),
32     partition p_q3_2007 values less than ( to_date('01-oct-2007','dd-mon-yyyy') ),
33     partition p_q4_2007 values less than ( to_date('01-jan-2008','dd-mon-yyyy') ),
34     partition p_q1_2008 values less than ( to_date('01-apr-2008','dd-mon-yyyy') ),
35     partition p_q2_2008 values less than ( to_date('01-jul-2008','dd-mon-yyyy') ),
36     partition p_q3_2008 values less than ( to_date('01-oct-2008','dd-mon-yyyy') ),
37     partition p_q4_2008 values less than ( to_date('01-jan-2009','dd-mon-yyyy') ),
38     partition p_q1_2009 values less than ( to_date('01-apr-2009','dd-mon-yyyy') ),
39     partition p_q2_2009 values less than ( to_date('01-jul-2009','dd-mon-yyyy') ),
40     partition p_q3_2009 values less than ( to_date('01-oct-2009','dd-mon-yyyy') ),
41     partition p_q4_2009 values less than ( to_date('01-jan-2010','dd-mon-yyyy') ),
42     partition p_q1_2010 values less than ( to_date('01-apr-2010','dd-mon-yyyy') ),
43     partition p_q2_2010 values less than ( to_date('01-jul-2010','dd-mon-yyyy') ),
44     partition p_q3_2010 values less than ( to_date('01-oct-2010','dd-mon-yyyy') ),
45     partition p_q4_2010 values less than ( to_date('01-jan-2011','dd-mon-yyyy') ),
46     partition p_q1_2011 values less than ( to_date('01-apr-2011','dd-mon-yyyy') ),
47     partition p_q2_2011 values less than ( to_date('01-jul-2011','dd-mon-yyyy') ),
48     partition p_q3_2011 values less than ( to_date('01-oct-2011','dd-mon-yyyy') ),
49     partition p_q4_2011 values less than ( to_date('01-jan-2012','dd-mon-yyyy') )
50 )
51 nologging
52 as
53 select     CUST_ID                 ,
54             CUST_FIRST_NAME ,
55             CUST_LAST_NAME ,
56             CUST_GENDER             ,
57             CUST_DATE               ,
58             CUST_POSTAL_CODE ,
59             CUST_CITY               ,
60             CUST_TOTAL
61 from customers_data;

Table created.

Elapsed: 00:00:16.24
rajesh@10GR2>
rajesh@10GR2>
rajesh@10GR2> exec dbms_stats.gather_table_stats(user,'CUSTOMERS');

PL/SQL procedure successfully completed.

Elapsed: 00:00:08.64
rajesh@10GR2>
rajesh@10GR2>
rajesh@10GR2> create table sales
2 (
3     SALES_ID                ,
4     CUST_ID                 ,
5     QUANTITY_SOLD   ,
6     AMOUNT_SOLD             ,
7     CUST_DATE
8 )
9 partition by range(CUST_DATE)
10 subpartition by hash(cust_id)
11 subpartition template
12 (
13     subpartition sp1,
14     subpartition sp2,
15     subpartition sp3,
16     subpartition sp4
17 )
18 (
19     partition p_q1_2005 values less than ( to_date('01-apr-2005','dd-mon-yyyy') ),
20     partition p_q2_2005 values less than ( to_date('01-jul-2005','dd-mon-yyyy') ),
21     partition p_q3_2005 values less than ( to_date('01-oct-2005','dd-mon-yyyy') ),
22     partition p_q4_2005 values less than ( to_date('01-jan-2006','dd-mon-yyyy') ),
23     partition p_q1_2006 values less than ( to_date('01-apr-2006','dd-mon-yyyy') ),
24     partition p_q2_2006 values less than ( to_date('01-jul-2006','dd-mon-yyyy') ),
25     partition p_q3_2006 values less than ( to_date('01-oct-2006','dd-mon-yyyy') ),
26     partition p_q4_2006 values less than ( to_date('01-jan-2007','dd-mon-yyyy') ),
27     partition p_q1_2007 values less than ( to_date('01-apr-2007','dd-mon-yyyy') ),
28     partition p_q2_2007 values less than ( to_date('01-jul-2007','dd-mon-yyyy') ),
29     partition p_q3_2007 values less than ( to_date('01-oct-2007','dd-mon-yyyy') ),
30     partition p_q4_2007 values less than ( to_date('01-jan-2008','dd-mon-yyyy') ),
31     partition p_q1_2008 values less than ( to_date('01-apr-2008','dd-mon-yyyy') ),
32     partition p_q2_2008 values less than ( to_date('01-jul-2008','dd-mon-yyyy') ),
33     partition p_q3_2008 values less than ( to_date('01-oct-2008','dd-mon-yyyy') ),
34     partition p_q4_2008 values less than ( to_date('01-jan-2009','dd-mon-yyyy') ),
35     partition p_q1_2009 values less than ( to_date('01-apr-2009','dd-mon-yyyy') ),
36     partition p_q2_2009 values less than ( to_date('01-jul-2009','dd-mon-yyyy') ),
37     partition p_q3_2009 values less than ( to_date('01-oct-2009','dd-mon-yyyy') ),
38     partition p_q4_2009 values less than ( to_date('01-jan-2010','dd-mon-yyyy') ),
39     partition p_q1_2010 values less than ( to_date('01-apr-2010','dd-mon-yyyy') ),
40     partition p_q2_2010 values less than ( to_date('01-jul-2010','dd-mon-yyyy') ),
41     partition p_q3_2010 values less than ( to_date('01-oct-2010','dd-mon-yyyy') ),
42     partition p_q4_2010 values less than ( to_date('01-jan-2011','dd-mon-yyyy') ),
43     partition p_q1_2011 values less than ( to_date('01-apr-2011','dd-mon-yyyy') ),
44     partition p_q2_2011 values less than ( to_date('01-jul-2011','dd-mon-yyyy') ),
45     partition p_q3_2011 values less than ( to_date('01-oct-2011','dd-mon-yyyy') ),
46     partition p_q4_2011 values less than ( to_date('01-jan-2012','dd-mon-yyyy') )
47 )
48 nologging
49 as
50 select     SALES_ID                ,
51             CUST_ID                 ,
52             QUANTITY_SOLD   ,
53             AMOUNT_SOLD             ,
54             CUST_DATE
55 from sales_data;

Table created.

Elapsed: 00:07:03.90
rajesh@10GR2> alter table customers add constraint cust_pk primary key(cust_id);

Table altered.

Elapsed: 00:00:11.01
rajesh@10GR2> alter table sales add constraint sales_fk foreign key(cust_id) references customers;

Table altered.

Elapsed: 00:00:19.17
rajesh@10GR2> select count(*) from sales;

COUNT(*)
----------
11863049

Elapsed: 00:01:12.48
rajesh@10GR2> select count(*) from customers;

COUNT(*)
----------
   1026072

Elapsed: 00:00:00.71
rajesh@10GR2>

rajesh@10GR2> SELECT c.* , s.*
2 from customers c, sales s
3 where c.cust_id = s.cust_id
4 and c.cust_date between to_date('01-jul-2005','dd-mon-yyyy')
5 and to_date('30-sep-2005','dd-mon-yyyy');

Elapsed: 00:00:00.01

Execution Plan
----------------------------------------------------------
Plan hash value: 3704956898

-----------------------------------------------------------------------------------
| Id | Operation                | Name      | Rows | Cost (%CPU)| Pstart| Pstop |
-----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT         |           |    11M| 39198   (2)|       |       |
|   1 | PARTITION HASH ALL      |           |    11M| 39198   (2)|     1 |     4 |
|* 2 |   HASH JOIN              |           |    11M| 39198   (2)|       |       |
|   3 |    PARTITION RANGE SINGLE|           |   929K| 1875   (2)|     3 |     3 |
|* 4 |     TABLE ACCESS FULL    | CUSTOMERS |   929K| 1875   (2)|     9 |    12 |
|   5 |    PARTITION RANGE ALL   |           |    11M| 12125   (2)|     1 |    28 |
|   6 |     TABLE ACCESS FULL    | SALES     |    11M| 12125   (2)|     1 |   112 |
-----------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("C"."CUST_ID"="S"."CUST_ID")
   4 - filter("C"."CUST_DATE"<=TO_DATE('2005-09-30 00:00:00', 'yyyy-mm-dd hh24:mi:ss'))

rajesh@10GR2>

All HASH Partition in SALES table is scanned (this is shown in plan as Pstart = 1 and Pstop = 112 at Id=6)
Only 4 Hash Partition in CUSTOMERS table corresponding to 2005, Q3 is scanned (this is shown in plan as Pstart = 9 and Pstop = 12 at Id=4)
the Access path PARTITION HASH ALL from plan says, only the 4 Hash Partition from CUSTOMERS table is joined (Full Partition-Wise Joins) with 4 Hash Partition from sales table (this is shown in plan as Pstart = 1 and Pstop = 4 at Id=1)

And, this is what Tkprof shows,

SELECT c.* , s.*
from customers c, sales s
where c.cust_id = s.cust_id
and c.cust_date between to_date('01-jul-2005','dd-mon-yyyy')
and to_date('30-sep-2005','dd-mon-yyyy')

call     count       cpu    elapsed       disk      query    current        rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch    68435     52.31      83.57      47086     130237          0    10265035
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total    68437     52.31      83.59      47086     130237          0    10265035

Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 100

Rows     Row Source Operation
------- ---------------------------------------------------
10265035 PARTITION HASH ALL PARTITION: 1 4 (cr=130237 pr=47086 pw=744 time=64025850 us)
10265035   HASH JOIN (cr=130237 pr=47086 pw=744 time=54649725 us)
884070    PARTITION RANGE SINGLE PARTITION: 3 3 (cr=8054 pr=0 pw=0 time=1768764 us)
884070     TABLE ACCESS FULL CUSTOMERS PARTITION: 9 12 (cr=8054 pr=0 pw=0 time=884626 us)
11863049    PARTITION RANGE ALL PARTITION: 1 28 (cr=122183 pr=46342 pw=0 time=71307400 us)
11863049     TABLE ACCESS FULL SALES PARTITION: 1 112 (cr=122183 pr=46342 pw=0 time=5957447

( Awesome, it processed 10M records in 83 secs )

Now, just changing to query to "find the sales details for all customers who made Transactions in Q3 2005 ", causes the plan to change like this.

rajesh@10GR2> SELECT c.* , s.*
2 from customers c, sales s
3 where c.cust_id = s.cust_id
4 and   c.cust_date = s.cust_date
5 and c.cust_date between to_date('01-jul-2005','dd-mon-yyyy')
6 and to_date('30-sep-2005','dd-mon-yyyy')
7 /
Elapsed: 00:00:00.03

Execution Plan
----------------------------------------------------------
Plan hash value: 3606662981

-----------------------------------------------------------------------------------
| Id | Operation                | Name      | Rows | Cost (%CPU)| Pstart| Pstop |
-----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT         |           |   801K| 34547   (2)|       |       |
|   1 | PARTITION HASH ALL      |           |   801K| 34547   (2)|     1 |     4 ||* 2 |   HASH JOIN              |           |   801K| 34547   (2)|       |       |
|   3 |    PARTITION RANGE SINGLE|           |   929K| 1875   (2)|     3 |     3 |
|* 4 |     TABLE ACCESS FULL    | CUSTOMERS |   929K| 1875   (2)|     9 |    12 |
|   5 |    PARTITION RANGE SINGLE|           |    10M| 10510   (3)|     3 |     3 |
|* 6 |     TABLE ACCESS FULL    | SALES     |    10M| 10510   (3)|     9 |    12 |
-----------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("C"."CUST_ID"="S"."CUST_ID" AND "C"."CUST_DATE"="S"."CUST_DATE")
   4 - filter("C"."CUST_DATE"<=TO_DATE('2005-09-30 00:00:00', 'yyyy-mm-dd hh24:mi:ss'))
   6 - filter("S"."CUST_DATE"<=TO_DATE('2005-09-30 00:00:00', 'yyyy-mm-dd hh24:mi:ss'))

rajesh@10GR2>

And the Tkprof show me,

SELECT c.* , s.*
from customers c, sales s
where c.cust_id = s.cust_id
and   c.cust_date = s.cust_date
and c.cust_date between to_date('01-jul-2005','dd-mon-yyyy')
and to_date('30-sep-2005','dd-mon-yyyy')

call     count       cpu    elapsed       disk      query    current        rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch     7205     14.59      63.64      46889      61643          0     1080530
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total     7207     14.59      63.65      46889      61643          0     1080530

Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 100

Rows     Row Source Operation
------- ---------------------------------------------------
1080530 PARTITION HASH ALL PARTITION: 1 4 (cr=61643 pr=46889 pw=775 time=13521195 us)
1080530   HASH JOIN (cr=61643 pr=46889 pw=775 time=13366665 us)
884070    PARTITION RANGE SINGLE PARTITION: 3 3 (cr=7992 pr=0 pw=0 time=1768864 us)
884070     TABLE ACCESS FULL CUSTOMERS PARTITION: 9 12 (cr=7992 pr=0 pw=0 time=884703 us)
10265035    PARTITION RANGE SINGLE PARTITION: 3 3 (cr=53651 pr=46114 pw=0 time=61625325 us)
10265035     TABLE ACCESS FULL SALES PARTITION: 9 12 (cr=53651 pr=46114 pw=0 time=41095102

Full partition-wise join will takes place only if you, equipartition both tables on their join keys. So, what will happen if you don't equipartition both tables.

scott@10GR2> create table customers
2 (
3     CUST_ID                                                 ,
4     CUST_FIRST_NAME                                 ,
5     CUST_LAST_NAME                                  ,
6     CUST_GENDER                                             ,
7     CUST_DATE                                               ,
8     CUST_POSTAL_CODE                                ,
9     CUST_CITY                                               ,
10     CUST_TOTAL
11 )
12 partition by hash(cust_id) partitions 8
13 nologging
14 as
15 select     CUST_ID                 ,
16             CUST_FIRST_NAME ,
17             CUST_LAST_NAME ,
18             CUST_GENDER             ,
19             CUST_DATE               ,
20             CUST_POSTAL_CODE ,
21             CUST_CITY               ,
22             CUST_TOTAL
23 from customers_data
24 where 1=0;

Table created.

Elapsed: 00:00:00.25
scott@10GR2>
scott@10GR2>
scott@10GR2> create table sales
2 (
3     SALES_ID                ,
4     CUST_ID                 ,
5     QUANTITY_SOLD   ,
6     AMOUNT_SOLD             ,
7     CUST_DATE
8 )
9 partition by hash(SALES_ID) partitions 8
10 nologging
11 as
12 select     SALES_ID                ,
13             CUST_ID                 ,
14             QUANTITY_SOLD   ,
15             AMOUNT_SOLD             ,
16             CUST_DATE
17 from sales_data
18 where 1=0;

Table created.

Elapsed: 00:00:00.21
scott@10GR2>

Now, Customer table is HASH partitioned on customer_id and Sales table is HASH partitioned on sales_id columns and now the Plan changes like below.

scott@10GR2>
scott@10GR2> begin
2     dbms_stats.set_table_stats(
3             ownname=>user,
4             tabname=>'CUSTOMERS',
5             numrows => 1026072,
6             numblks => 1000000 );
7
8     dbms_stats.set_table_stats(
9             ownname=>user,
10             tabname=>'SALES',
11             numrows => 11863049,
12             numblks => 1000000 );
13 end;
14 /

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.10
scott@10GR2>
scott@10GR2> set autotrace traceonly explain;
scott@10GR2>
scott@10GR2> SELECT c.* , s.*
2 from customers c, sales s
3 where c.cust_id = s.cust_id
4 and c.cust_date between to_date('01-jul-2005','dd-mon-yyyy')
5 and to_date('30-sep-2005','dd-mon-yyyy');

Elapsed: 00:00:00.01

Execution Plan
----------------------------------------------------------
Plan hash value: 742433711

------------------------------------------------------------------------------
| Id | Operation           | Name      | Rows | Cost (%CPU)| Pstart| Pstop |
------------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |           | 29658 |   438K (1)|       |       |
|* 1 | HASH JOIN          |           | 29658 |   438K (1)|       |       |
|   2 |   PARTITION HASH ALL|           | 2565 |   219K (1)|     1 |     8 |
|* 3 |    TABLE ACCESS FULL| CUSTOMERS | 2565 |   219K (1)|     1 |     8 |
|   4 |   PARTITION HASH ALL|           |    11M|   219K (1)|     1 |     8 |
|   5 |    TABLE ACCESS FULL| SALES     |    11M|   219K (1)|     1 |     8 |
------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - access("C"."CUST_ID"="S"."CUST_ID")
   3 - filter("C"."CUST_DATE">=TO_DATE('2005-07-01 00:00:00', 'yyyy-mm-dd hh24:mi:ss')
              AND "C"."CUST_DATE"<=TO_DATE('2005-09-30 00:00:00', 'yyyy-mm-dd hh24:mi:ss'))

scott@10GR2>

You can see the Full partitioned wise join doesn't happen now ( PARTITION HASH ALL access path missing above HASH JOIN in plans ).

These kinds of large join is typical in data warehousing environments, when data-model is designed properly Full Partition Wise Joins will really be help full.

Friday, January 21, 2011

Histograms on Date, Varchar and Numeric columns

Not, entirely learnt newly today but revisited some good old stuffs. Its about how to translate a histogram into a report of underlying datas (sometimes to see how histograms matches real datas). Lets see how this to be handled on dates, numeric and Varchar2 columns.

rajesh@10GR2>
rajesh@10GR2> create table t
2 nologging
3 as
4 select object_id,ntile(10) over(order by object_id) as ntile
5 from all_objects;

Table created.

Elapsed: 00:00:01.90
rajesh@10GR2>
rajesh@10GR2> begin
2     dbms_stats.gather_table_stats(
3             ownname =>user,
4             tabname=>'T',
5             method_opt=>'for all columns size 10',
6             estimate_percent=>100
7             );
8 end;
9 /

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.64
rajesh@10GR2>

Now table 'T' will have only values from 1 to 10 in the column NTILE. so provide the values SIZE 10 in gathering histograms. Now looking at user_tab_histograms will represent below values.

rajesh@10GR2> select table_name, column_name, endpoint_number, endpoint_value
2 from user_tab_histograms
3 where table_name ='T'
4 and column_name ='NTILE'
5 order by column_name,endpoint_value;

TABLE_NAME COLUMN_NAM ENDPOINT_NUMBER                        ENDPOINT_VALUE
---------- ---------- --------------- -------------------------------------
T          NTILE                 5675                                     1
T          NTILE                11349                                     2
T          NTILE                17023                                     3
T          NTILE                22697                                     4
T          NTILE                28371                                     5
T          NTILE                34045                                     6
T          NTILE                39719                                     7
T          NTILE                45393                                     8
T          NTILE                51067                                     9
T          NTILE                56741                                    10

10 rows selected.

Elapsed: 00:00:00.54
rajesh@10GR2>

The ENDPOINT_VALUE value in user_tab_histograms holds the list of values available in column NTILE. the ENDPOINT_NUMBER is cumulative frequency for appearance of each ENDPOINT_VALUE's.

rajesh@10GR2> select ntile,count(*)
2 from t
3 group by ntile;

     NTILE   COUNT(*)
---------- ----------
         1       5675
         6       5674
         2       5674
         4       5674
         5       5674
         8       5674
         3       5674
         7       5674
         9       5674
        10       5674

10 rows selected.

Elapsed: 00:00:00.28
rajesh@10GR2>
rajesh@10GR2>

This show that we have count of 5674 (approx) for each value of NTILE. lets see how we can match this using Histogram buckets (from below query).

rajesh@10GR2> select table_name, column_name, endpoint_number, endpoint_value,
2             endpoint_number - lag(endpoint_number,1,0) over(order by endpoint_value) as actual_valu
3 from user_tab_histograms
4 where table_name ='T'
5 and column_name ='NTILE'
6 order by column_name,endpoint_value;

TABLE_NAME COLUMN_NAM ENDPOINT_NUMBER ENDPOINT_VALUE ACTUAL_VALUES
---------- ---------- --------------- -------------- -------------
T          NTILE                 5675              1          5675
T          NTILE                11349              2          5674
T          NTILE                17023              3          5674
T          NTILE                22697              4          5674
T          NTILE                28371              5          5674
T          NTILE                34045              6          5674
T          NTILE                39719              7          5674
T          NTILE                45393              8          5674
T          NTILE                51067              9          5674
T          NTILE                56741             10          5674

10 rows selected.

Elapsed: 00:00:00.04
rajesh@10GR2>
rajesh@10GR2>

This code is for Numeric columns, for Date columns we need to get day part and fraction of day, multiplies the fraction by the number of seconds in a day, converts both bits to character and then uses the ‘Julian’ and ‘seconds’ conversion format.

rajesh@10GR2> create table t
2 as
3 select sysdate as dt, sysdate - ntile(10) over(order by object_id) as ntile
4 from all_objects;

Table created.

Elapsed: 00:00:02.12
rajesh@10GR2>
rajesh@10GR2> begin
2     dbms_stats.gather_table_stats(
3             ownname =>user,
4             tabname=>'T',
5             method_opt=>'for all columns size 10',
6             estimate_percent=>100
7             );
8 end;
9 /

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.39
rajesh@10GR2>

rajesh@10GR2>
rajesh@10GR2> select ntile,count(*)
2 from t
3 group by ntile
4 order by 1;

NTILE       COUNT(*)
--------- ----------
11-JAN-11       5674
12-JAN-11       5674
13-JAN-11       5674
14-JAN-11       5674
15-JAN-11       5674
16-JAN-11       5674
17-JAN-11       5674
18-JAN-11       5674
19-JAN-11       5674
20-JAN-11       5675

10 rows selected.

Elapsed: 00:00:00.07
rajesh@10GR2>
rajesh@10GR2>

Now, reconstructing and validating data's from Histogram buckets will be like below.

rajesh@10GR2> SELECT table_name,
2    column_name,
3    endpoint_number,
4    endpoint_number - lag(endpoint_number,1,0) over(order by endpoint_value) AS COUNT,
5    to_date(floor(endpoint_value)
6    ||'.'
7    || TO_CHAR(86400 * MOD(endpoint_value, 1), 'FM999999999'),'J.sssss')endpoint_value
8 FROM user_tab_histograms
9 WHERE table_name ='T'
10 AND column_name ='NTILE'
11 ORDER BY column_name,endpoint_value;

TABLE_NAME COLUMN_NAM ENDPOINT_NUMBER      COUNT ENDPOINT_
---------- ---------- --------------- ---------- ---------
T          NTILE                 5674       5674 11-JAN-11
T          NTILE                11348       5674 12-JAN-11
T          NTILE                17022       5674 13-JAN-11
T          NTILE                22696       5674 14-JAN-11
T          NTILE                28370       5674 15-JAN-11
T          NTILE                34044       5674 16-JAN-11
T          NTILE                39718       5674 17-JAN-11
T          NTILE                45392       5674 18-JAN-11
T          NTILE                51066       5674 19-JAN-11
T          NTILE                56741       5675 20-JAN-11

10 rows selected.

Elapsed: 00:00:00.07
rajesh@10GR2>

Now for character based columns it involves.

Take the first six bytes of the string
View this as a hexadecimal number, and convert to decimal
Round to 15 significant digits and store as the endpoint_value
If duplicate rows appear, store the first 32 bytes of each string as the endpoint_actual_value

rajesh@10GR2> create table t
2 as
3 select empno, ename as OBJECT_NAME
4 from emp
5 where rownum <= 10;

Table created.

Elapsed: 00:00:00.39
rajesh@10GR2>
rajesh@10GR2>
rajesh@10GR2> begin
2     dbms_stats.gather_table_stats(
3             ownname =>user,
4             tabname=>'T',
5             method_opt=>'for all columns size 10',
6             estimate_percent=>100
7             );
8 end;
9 /

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.14
rajesh@10GR2>
rajesh@10GR2>
rajesh@10GR2> SELECT endpoint_number,
2    endpoint_number - NVL(prev_endpoint,0) frequency,
3    hex_values,
4    chr(to_number(SUBSTR(hex_values, 2,2),'XX'))
5    || chr(to_number(SUBSTR(hex_values, 4,2),'XX'))
6    || chr(to_number(SUBSTR(hex_values, 6,2),'XX'))
7    || chr(to_number(SUBSTR(hex_values, 8,2),'XX'))
8    || chr(to_number(SUBSTR(hex_values,10,2),'XX'))
9    || chr(to_number(SUBSTR(hex_values,12,2),'XX'))
10    || chr(to_number(SUBSTR(hex_values,14,2),'XX'))
11    || chr(to_number(SUBSTR(hex_values,16,2),'XX'))
12    || chr(to_number(SUBSTR(hex_values,18,2),'XX'))
13    || chr(to_number(SUBSTR(hex_values,20,2),'XX'))
14    || chr(to_number(SUBSTR(hex_values,22,2),'XX'))
15    || chr(to_number(SUBSTR(hex_values,24,2),'XX'))
16    as column_values
17 FROM
18    (SELECT endpoint_number,
19      lag(endpoint_number,1) over( order by endpoint_number ) prev_endpoint,
20      TO_CHAR(endpoint_value,'XXXXXXXXXXXXXXXXXXXXXXXXXXXXXX')hex_values,
21      endpoint_actual_value
22    FROM user_tab_histograms
23    WHERE table_name = 'T'
24    AND column_name = 'OBJECT_NAME'
25    )
26 ORDER BY endpoint_number ;

ENDPOINT_NUMBER FREQUENCY HEX_VALUES                      COLUMN_VALUE
--------------- ---------- ------------------------------- ------------
              1          1 414C4C454DFFEFD17E20CECE400000 ALLEM n-~ ++
              2          1 424C414B44FFF5632009023BE00000 BLAKD )c     ?;
              3          1 434C41524B001321ED8AF356E00000 CLARK ?!fè=V
              4          1 4A4F4E4552FFEC7EDB45D5C0800000 JONER 8~¦E++
              5          1 4B494E4700001D9911EA0EFCE00000 KING ?Ö?O?n
              6          1 4D415254494DFF74618D0FA4400000 MARTIM taì¤ñ
              7          1 53434F54540016FED29C0B71200000 SCOTT ?¦-£?q
              8          1 534D49544800059420ACC8DF600000 SMITH ?ö ¼+¯
              9          1 5455524E455201F1D23C139C800000 TURNER?±-<?£
             10          1 57415243FFFFF52B667FFCC9400000 WARC  )+f¦n+

10 rows selected.

Elapsed: 00:00:00.75
rajesh@10GR2>

But due to some Hexadecimal value mismatch, I feel column  COLUMN_VALUE from the above output provides some special characters along with actual data.

Tuesday, January 11, 2011

Histograms 9i Vs 10G

Where there is a high degree of skew in the column distribution, called a non-uniform distribution of data, histograms should lead to a better estimation of selectivity. This should produce plans that are more likely to be optimal. The histogram approach provides an efficient and compact way to represent data distributions.

When building histograms the information it stores is interpreted differently depending on whether the number of buckets requested is less than the number distinct values or if it is the same.

rajesh@9IR2> create table t
2 as
3 select rownum as a,
4        case when rownum <=9 then rownum
5                     else 99 end as b
6 from dual
7 connect by level <=10000;

Table created.

Elapsed: 00:00:00.03
rajesh@9IR2>

The column 'a' has unique values from 1 to 10000.
The column 'b' has values from 1 to 9 occurs only once and '99' occurs 9991 times. (massively skewed)

select * from t where b=99;
select * from t where b=3;

Both the above queries use FULL TABLE SCAN as there is no other access path available.

rajesh@9IR2> set autotrace traceonly explain;
rajesh@9IR2> select * from t where b=99;

Elapsed: 00:00:00.00

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE
   1    0   TABLE ACCESS (FULL) OF 'T'

rajesh@9IR2> select * from t where b=3;

Elapsed: 00:00:00.00

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE
   1    0   TABLE ACCESS (FULL) OF 'T'

rajesh@9IR2>
rajesh@9IR2> set autotrace off;

lets create Index on this massively skewed column.

rajesh@9IR2> create index t_ind on t(b) nologging;

Index created.

Elapsed: 00:00:00.03
rajesh@9IR2>
rajesh@9IR2> select table_name, column_name, nullable, num_buckets, num_distinct
2 from user_tab_columns
3 where table_name ='T'
4 /

TABLE_NAME           COLUMN_NAME          N NUM_BUCKETS NUM_DISTINCT
-------------------- -------------------- - ----------- ------------
T                    A                    Y
T                    B                    Y

Elapsed: 00:00:00.01

rajesh@9IR2>

With index present query(2) should use INDEX RANGE SCAN and FULL TABLE SCAN for query(1).

rajesh@9IR2> set autotrace traceonly explain;
rajesh@9IR2> select * from t where b=99;

Elapsed: 00:00:00.00

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE
   1    0   TABLE ACCESS (BY INDEX ROWID) OF 'T'
   2    1     INDEX (RANGE SCAN) OF 'T_IND' (NON-UNIQUE)

rajesh@9IR2> select * from t where b=3;

Elapsed: 00:00:00.00

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE
   1    0   TABLE ACCESS (BY INDEX ROWID) OF 'T'
   2    1     INDEX (RANGE SCAN) OF 'T_IND' (NON-UNIQUE)

rajesh@9IR2>
rajesh@9IR2> set autotrace off;

Both do an INDEX RANGE SCAN to get the ROWID to do a lookup in the table 'T'.

Analyzing the table using DBMS_STATS API shows the following information in data dictionary.

rajesh@9IR2> exec dbms_stats.gather_table_stats(user,'T');

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.04
rajesh@9IR2>
rajesh@9IR2> select table_name, column_name, nullable, num_buckets, num_distinct
2 from user_tab_columns
3 where table_name ='T'
4 /

TABLE_NAME           COLUMN_NAME          N NUM_BUCKETS NUM_DISTINCT
-------------------- -------------------- - ----------- ------------
T                    A                    Y           1        10000
T                    B                    Y           1           10

Elapsed: 00:00:00.31
rajesh@9IR2>

for Histograms it is,

rajesh@9IR2> SELECT table_name, column_name, endpoint_number, endpoint_value
2 FROM user_tab_histograms
3 WHERE table_name ='T'
4 order by column_name,endpoint_number
5 /

TABLE_NAME           COLUMN_NAME          ENDPOINT_NUMBER ENDPOINT_VALUE
-------------------- -------------------- --------------- --------------
T                    A                                  0              1
T                    A                                  1          10000
T                    B                                  0              1
T                    B                                  1             99

Elapsed: 00:00:00.04
rajesh@9IR2>

Analyze has created 1 bucket ( since the DEFAULT value for method_opt parameter is 'FOR ALL COLUMNS SIZE 1' ) for each column, so all values for the column are in same bucket. ENDPOINT_NUMBER represents Bucket Number and ENDPOINT_VALUE represents last column value present in the bucket.

Now both query ( query(1) & query(2) ) will do a FULL TABLE SCAN.

rajesh@9IR2> set autotrace traceonly explain;
rajesh@9IR2> select * from t where b=99;

Elapsed: 00:00:00.00

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=3 Card=1000 Bytes=6000)
   1    0   TABLE ACCESS (FULL) OF 'T' (Cost=3 Card=1000 Bytes=6000)

rajesh@9IR2> select * from t where b=3;

Elapsed: 00:00:00.01

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=3 Card=1000 Bytes=6000)
   1    0   TABLE ACCESS (FULL) OF 'T' (Cost=3 Card=1000 Bytes=6000)

rajesh@9IR2>
rajesh@9IR2> set autotrace off;

So, the fact that you have statistics about the table and columns does not help the optimizer to distinguish between how many of each value we have. The reason it does a FULL TABLE SCAN is because there is a 1 BUCKET histogram and any value selected for should be in that bucket.

Now we need to create Histograms (based on the Skewed data) that will be really help full to the optimizer to come up with Optimal plan. ( SIZE can be given as SIZE 10 if you know exact data values or AUTO - Will helps Oracle determines the columns to collect histograms based on data distribution and the workload of the columns )

rajesh@9IR2> EXEC dbms_stats.gather_table_stats(ownname=>user,tabname=>'T',method_opt=>'FOR ALL COLUMNS SIZE AUTO');

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.03
rajesh@9IR2>
rajesh@9IR2>
rajesh@9IR2> select table_name, column_name, nullable, num_buckets, num_distinct
2 from user_tab_columns
3 where table_name ='T'
4 /

TABLE_NAME           COLUMN_NAME          N NUM_BUCKETS NUM_DISTINCT
-------------------- -------------------- - ----------- ------------
T                    A                    Y           1        10000
T                    B                    Y           9           10

Elapsed: 00:00:00.01
rajesh@9IR2>
rajesh@9IR2> SELECT table_name, column_name, endpoint_number, endpoint_value
2 FROM user_tab_histograms
3 WHERE table_name ='T'
4 order by column_name,endpoint_number
5 /

TABLE_NAME           COLUMN_NAME          ENDPOINT_NUMBER ENDPOINT_VALUE
-------------------- -------------------- --------------- --------------
T                    A                                  0              1
T                    A                                  1          10000
T                    B                                  1              1
T                    B                                  2              2
T                    B                                  3              3
T                    B                                  4              4
T                    B                                  5              5
T                    B                                  6              6
T                    B                                  7              7
T                    B                                  8              8
T                    B                                  9              9
T                    B                              10000             99

12 rows selected.

Elapsed: 00:00:00.37
rajesh@9IR2>

Analyze has created 1 bucket for column 'A' having all 10000 unique values in one bucket and 9 buckets for column 'B'.

The ENDPOINT_VALUE shows the column value and the ENDPOINT_NUMBER shows the cumulative number of rows.

For example, for ENDPOINT_VALUE 2 for column 'B' , it has an ENDPOINT_NUMBER 2, the previous ENDPOINT_NUMBER is 1, hence the number of rows with value 2 is 1.

Simillarly ENDPOINT_VALUE 99. Its ENDPOINT_NUMBER is 10000, The previous bucket ENDPOINT_NUMBER is 9, so 10000 - 9 = 9991 rows containing value of '99'.

Now the Query(1) will do a FULL TABLE SCAN and Query(2) will do a INDEX RANGE SCAN like below.

rajesh@9IR2> set autotrace traceonly explain;
rajesh@9IR2> select * from t where b=99;

Elapsed: 00:00:00.00

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=3 Card=9991 Bytes=59946)
   1    0   TABLE ACCESS (FULL) OF 'T' (Cost=3 Card=9991 Bytes=59946)

rajesh@9IR2> select * from t where b=3;

Elapsed: 00:00:00.01

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=2 Card=1 Bytes=6)
   1    0   TABLE ACCESS (BY INDEX ROWID) OF 'T' (Cost=2 Card=1 Bytes=6)
   2    1     INDEX (RANGE SCAN) OF 'T_IND' (NON-UNIQUE) (Cost=1 Card=1)

rajesh@9IR2>
rajesh@9IR2>

This is fine, if we have a low distinct values, but there can be tables with huge number of distinct values, you dont want to create a bucket for each value.

rajesh@9IR2> EXEC dbms_stats.gather_table_stats(ownname=>user,tabname=>'T',method_opt=>'FOR ALL COLUMNS SIZE 5');

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.14
rajesh@9IR2>
rajesh@9IR2>
rajesh@9IR2> select table_name, column_name, nullable, num_buckets, num_distinct
2 from user_tab_columns
3 where table_name ='T'
4 /

TABLE_NAME           COLUMN_NAME          N NUM_BUCKETS NUM_DISTINCT
-------------------- -------------------- - ----------- ------------
T                    A                    Y           5        10000
T                    B                    Y           1           10

Elapsed: 00:00:00.03
rajesh@9IR2>
rajesh@9IR2>
rajesh@9IR2> SELECT table_name, column_name, endpoint_number, endpoint_value
2 FROM user_tab_histograms
3 WHERE table_name ='T'
4 order by column_name,endpoint_number
5 /

TABLE_NAME           COLUMN_NAME          ENDPOINT_NUMBER ENDPOINT_VALUE
-------------------- -------------------- --------------- --------------
T                    A                                  0              1
T                    A                                  1           2000
T                    A                                  2           4000
T                    A                                  3           6000
T                    A                                  4           8000
T                    A                                  5          10000
T                    B                                  0              1
T                    B                                  5             99

8 rows selected.

Elapsed: 00:00:00.04
rajesh@9IR2>
rajesh@9IR2>

Now we have 5 buckets for column 'A' where data ( 10000 / 5 = 2000 )  gets equially distributed.

rajesh@9IR2> select 10000 / 5 from dual;

   10000/5
----------
      2000

For column 'B', bucket '0' holds the lowest value. you cannot see the buckets 1 to bucket 4 so as to save space.

rajesh@9IR2> set autotrace traceonly explain;
rajesh@9IR2> select * from t where b=99;

Elapsed: 00:00:00.01

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=3 Card=9000 Bytes=54000)
   1    0   TABLE ACCESS (FULL) OF 'T' (Cost=3 Card=9000 Bytes=54000)

rajesh@9IR2> select * from t where b=3;

Elapsed: 00:00:00.01

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=2 Card=1 Bytes=6)
   1    0   TABLE ACCESS (BY INDEX ROWID) OF 'T' (Cost=2 Card=1 Bytes=6)
   2    1     INDEX (RANGE SCAN) OF 'T_IND' (NON-UNIQUE) (Cost=1 Card=1)

rajesh@9IR2>
rajesh@9IR2> set autotrace off;

Oracle 10g Optimizer is pretty good in handling Histograms. Repeating the same snippet in 10GR2 database I see ( since the DEFAULT value for method_opt parameter is 'FOR ALL COLUMNS SIZE AUTO' )

(For demonstration i disable dynamic sampling at session level, to show how thing keep going )

rajesh@10GR2> alter session set optimizer_dynamic_sampling=0;

Session altered.

Elapsed: 00:00:00.01
rajesh@10GR2>

rajesh@10GR2> create table t
2 as
3 select rownum as a,
4        case when rownum <=9 then rownum
5                     else 99 end as b
6 from dual
7 connect by level <=10000;

Table created.

Elapsed: 00:00:00.35
rajesh@10GR2> set autotrace traceonly explain;
rajesh@10GR2> select * from t where b=99;

Elapsed: 00:00:00.01

Execution Plan
----------------------------------------------------------
Plan hash value: 1601196873
--------------------------------------------------------------------------
| Id | Operation         | Name | Rows | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT |      |    17 |   442 |     6   (0)| 00:00:01 |
|* 1 | TABLE ACCESS FULL| T    |    17 |   442 |     6   (0)| 00:00:01 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter("B"=99)

rajesh@10GR2> select * from t where b=3;

Elapsed: 00:00:00.01

Execution Plan
----------------------------------------------------------
Plan hash value: 1601196873
--------------------------------------------------------------------------
| Id | Operation         | Name | Rows | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT |      |    17 |   442 |     6   (0)| 00:00:01 |
|* 1 | TABLE ACCESS FULL| T    |    17 |   442 |     6   (0)| 00:00:01 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter("B"=3)

rajesh@10GR2>
Both the above queries would use a FULL TABLE SCAN as there is no other access method available.

rajesh@10GR2> create index t_ind on t(b) nologging;

Index created.

Elapsed: 00:00:00.12
rajesh@10GR2>
rajesh@10GR2> set autotrace off;
rajesh@10GR2>
rajesh@10GR2> select table_name, column_name, nullable, num_buckets, num_distinct
2 from user_tab_columns
3 where table_name ='T'
4 /

TABLE_NAME                     COLUMN_NAME                    N NUM_BUCKETS NUM_DISTINCT
------------------------------ ------------------------------ - ----------- ------------
T                              A                              Y
T                              B                              Y

Elapsed: 00:00:00.04
rajesh@10GR2>
rajesh@10GR2> set autotrace traceonly explain;
rajesh@10GR2> select * from t where b=99;
Elapsed: 00:00:00.01

Execution Plan
----------------------------------------------------------
Plan hash value: 1376202287
-------------------------------------------------------------------------------------
| Id | Operation                   | Name | Rows | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |       |    17 |   442 |     3   (0)| 00:00:01 |
|   1 | TABLE ACCESS BY INDEX ROWID| T     |    17 |   442 |     3   (0)| 00:00:01 |
|* 2 |   INDEX RANGE SCAN          | T_IND |     7 |       |     2   (0)| 00:00:01 |
-------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("B"=99)

rajesh@10GR2> select * from t where b=3;

Elapsed: 00:00:00.01

Execution Plan
----------------------------------------------------------
Plan hash value: 1376202287
-------------------------------------------------------------------------------------
| Id | Operation                   | Name | Rows | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |       |    17 |   442 |     3   (0)| 00:00:01 |
|   1 | TABLE ACCESS BY INDEX ROWID| T     |    17 |   442 |     3   (0)| 00:00:01 |
|* 2 |   INDEX RANGE SCAN          | T_IND |     7 |       |     2   (0)| 00:00:01 |
-------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("B"=3)

rajesh@10GR2>
rajesh@10GR2> SELECT table_name, column_name, endpoint_number, endpoint_value
2 FROM user_tab_histograms
3 WHERE table_name ='T'
4 /

no rows selected

Elapsed: 00:00:00.01
rajesh@10GR2>
rajesh@10GR2>

'method_opt' parameter in gather_table_stats method has default value of FOR ALL COLUMNS SIZE AUTO - this helps Oracle determines the columns to collect histograms based on data distribution.

rajesh@10GR2> exec dbms_stats.gather_table_stats(user,'T');

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.37
rajesh@10GR2>
rajesh@10GR2>
rajesh@10GR2> select table_name, column_name, nullable, num_buckets, num_distinct
2 from user_tab_columns
3 where table_name ='T'
4 /

TABLE_NAME           COLUMN_NAME          N NUM_BUCKETS NUM_DISTINCT
-------------------- -------------------- - ----------- ------------
T                    A                    Y           1        10000
T                    B                    Y          10           10

Elapsed: 00:00:00.01
rajesh@10GR2>
rajesh@10GR2>
rajesh@10GR2> SELECT table_name, column_name, endpoint_number, endpoint_value
2 FROM user_tab_histograms
3 WHERE table_name ='T'
4 order by column_name,endpoint_number
5 /

TABLE_NAME           COLUMN_NAME          ENDPOINT_NUMBER ENDPOINT_VALUE
-------------------- -------------------- --------------- --------------
T                    A                                  0              1
T                    A                                  1          10000
T                    B                                  1              1
T                    B                                  2              2
T                    B                                  3              3
T                    B                                  4              4
T                    B                                  5              5
T                    B                                  6              6
T                    B                                  7              7
T                    B                                  8              8
T                    B                                  9              9
T                    B                              10000             99

12 rows selected.

Elapsed: 00:00:00.07
rajesh@10GR2>

Now the query(1) does Full Table scan (due to massively skewed data on column 'b') and query(1) does an Index Range scan.

rajesh@10GR2> set autotrace traceonly explain;
rajesh@10GR2> select * from t where b=99;

Elapsed: 00:00:00.00

Execution Plan
----------------------------------------------------------
Plan hash value: 1601196873
--------------------------------------------------------------------------
| Id | Operation         | Name | Rows | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT |      | 9991 | 59946 |     6   (0)| 00:00:01 |
|* 1 | TABLE ACCESS FULL| T    | 9991 | 59946 |     6   (0)| 00:00:01 |
--------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter("B"=99)

rajesh@10GR2> select * from t where b=3;

Elapsed: 00:00:00.01

Execution Plan
----------------------------------------------------------
Plan hash value: 1376202287
-------------------------------------------------------------------------------------
| Id | Operation                   | Name | Rows | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |       |     1 |     6 |     2   (0)| 00:00:01 |
|   1 | TABLE ACCESS BY INDEX ROWID| T     |     1 |     6 |     2   (0)| 00:00:01 |
|* 2 |   INDEX RANGE SCAN          | T_IND |     1 |       |     1   (0)| 00:00:01 |
-------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("B"=3)

rajesh@10GR2>
rajesh@10GR2>

Books I follow...

Advanced SQL Functions in Oracle 10g - by Richard Walsh Earp
Cost-Based Oracle Fundamentals - By Jonathan Lewis
Effective Oracle by Design - by Thomas kyte
Expert One on One Oracle - by Thomas kyte
Expert Oracle - by Thomas kyte
Expert Oracle Database Architecture Oracle Database 9i, 10g - Thomas Kyte
Expert Oracle Database Architecture Oracle Database 9i, 10g, and 11g Programming Techniques and Solutions - Thomas Kyte
Expert Oracle Practices Oracle Database Administration from the Oak Table
High-Performance Oracle Database Applications - by Donald K. Burleson
Mastering Oracle PL/SQL: Practical Solutions - by Connor McDonald
Mastering Oracle Scheduler in Oracle 11g Databases - by Ronald Rood
Mastering Oracle SQL and SQL*Plus - By LEX DE HAAN
Optimizing Oracle Performance - By Cary Millsap
Oracle Database 10g Performance Tuning Tips & Techniques - By Richard J. Niemiec
Oracle Database 10g XML & SQL: Design, Build & Manage XML Applications in Java, C, C++ & PL/SQL - by Mark V. Scardina, Ben Chang and Jinyu Wang
Oracle Database 10g: The Complete Reference - by Kevin Loney
Oracle Database 11g DBA Handbook - by Bob Bryla , Kevin Loney
Oracle Database 11g New Features - Maximize the new capabilities of the Latest Database Release - by Robert G. Freeman
Oracle Performance Tuning for 10gR2 - By Gavin Powell
Oracle PL/SQL for DBAs By Steven Feuerstein, Arup Nanda
Oracle SQL*Plus: The Definitive Guide - By Jonathan Gennick
Oracle Wait Interface: A Practical Guide to Performance Diagnostics & Tuning by Richmond Shee, Kirtikumar Deshpande and K Gopalakrishnan
Oracle8i Internal Services for Waits, Latches, Locks, and Memory - by Steve Adams
Professional Oracle 8i Application Programming - By Thomas Kyte
Secrets of the Oracle Database - By Norbert Debes
Troubleshooting Oracle Performance - By Christian Antognini
Tuning Third-party Vendor Oracle systems Tuning when you can't touch the code - By Mike Ault
Using Oracle SQL Stored Outlines & Optimizer Plan Stability - By Mike Ault