Saturday, January 29, 2011

PARTITION RANGE SUBQUERY - New Access Path in 10G

I saw this and found it interesting.

If the table with the WHERE predicate is relatively small compared to the partitioned table, and the expected reduction of records or partitions for the partitioned table is significant, Oracle will perform dynamic partition pruning using a recursive sub-query

Here is the snippet of code to demonstrate that.

rajesh@10GR2> create table sales
  2  (
  3     SALES_ID                ,
  4     CUST_ID                 ,
  5     QUANTITY_SOLD   ,
  6     AMOUNT_SOLD             ,
  7     CUST_DATE
  8  )
  9  partition by range(CUST_DATE)
 10  subpartition by hash(cust_id)
 11  subpartition template
 12  (
 13    subpartition sp1,
 14    subpartition sp2,
 15    subpartition sp3,
 16    subpartition sp4
 17  )
 18  (
 19    partition p_q1_2005  values less than ( to_date('01-apr-2005','dd-mon-yyyy') ),
 20    partition p_q2_2005  values less than ( to_date('01-jul-2005','dd-mon-yyyy') ),
 21    partition p_q3_2005  values less than ( to_date('01-oct-2005','dd-mon-yyyy') ),
 22    partition p_q4_2005  values less than ( to_date('01-jan-2006','dd-mon-yyyy') ),
 23    partition p_q1_2006  values less than ( to_date('01-apr-2006','dd-mon-yyyy') ),
 24    partition p_q2_2006  values less than ( to_date('01-jul-2006','dd-mon-yyyy') ),
 25    partition p_q3_2006  values less than ( to_date('01-oct-2006','dd-mon-yyyy') ),
 26    partition p_q4_2006  values less than ( to_date('01-jan-2007','dd-mon-yyyy') ),
 27    partition p_q1_2007  values less than ( to_date('01-apr-2007','dd-mon-yyyy') ),
 28    partition p_q2_2007  values less than ( to_date('01-jul-2007','dd-mon-yyyy') ),
 29    partition p_q3_2007  values less than ( to_date('01-oct-2007','dd-mon-yyyy') ),
 30    partition p_q4_2007  values less than ( to_date('01-jan-2008','dd-mon-yyyy') ),
 31    partition p_q1_2008  values less than ( to_date('01-apr-2008','dd-mon-yyyy') ),
 32    partition p_q2_2008  values less than ( to_date('01-jul-2008','dd-mon-yyyy') ),
 33    partition p_q3_2008  values less than ( to_date('01-oct-2008','dd-mon-yyyy') ),
 34    partition p_q4_2008  values less than ( to_date('01-jan-2009','dd-mon-yyyy') ),
 35    partition p_q1_2009  values less than ( to_date('01-apr-2009','dd-mon-yyyy') ),
 36    partition p_q2_2009  values less than ( to_date('01-jul-2009','dd-mon-yyyy') ),
 37    partition p_q3_2009  values less than ( to_date('01-oct-2009','dd-mon-yyyy') ),
 38    partition p_q4_2009  values less than ( to_date('01-jan-2010','dd-mon-yyyy') ),
 39    partition p_q1_2010  values less than ( to_date('01-apr-2010','dd-mon-yyyy') ),
 40    partition p_q2_2010  values less than ( to_date('01-jul-2010','dd-mon-yyyy') ),
 41    partition p_q3_2010  values less than ( to_date('01-oct-2010','dd-mon-yyyy') ),
 42    partition p_q4_2010  values less than ( to_date('01-jan-2011','dd-mon-yyyy') ),
 43    partition p_q1_2011  values less than ( to_date('01-apr-2011','dd-mon-yyyy') ),
 44    partition p_q2_2011  values less than ( to_date('01-jul-2011','dd-mon-yyyy') ),
 45    partition p_q3_2011  values less than ( to_date('01-oct-2011','dd-mon-yyyy') ),
 46    partition p_q4_2011  values less than ( to_date('01-jan-2012','dd-mon-yyyy') )
 47  )
 48  nologging
 49  as
 50  select             SALES_ID         ,
 51                     CUST_ID          ,
 52                     QUANTITY_SOLD    ,
 53                     AMOUNT_SOLD      ,
 54                     trunc(CUST_DATE)
 55  from sales_data;

Table created.

Elapsed: 00:01:50.23
rajesh@10GR2>
rajesh@10GR2> select count(*) from sales;

  COUNT(*)
----------
  11863049

Elapsed: 00:00:35.03
rajesh@10GR2>
rajesh@10GR2>
rajesh@10GR2> exec dbms_stats.gather_table_stats(user,'SALES');

PL/SQL procedure successfully completed.

Elapsed: 00:03:51.75
rajesh@10GR2>
rajesh@10GR2> create index sales_idx_01 on sales(sales_id) nologging;

Index created.

Elapsed: 00:02:29.23
rajesh@10GR2>
rajesh@10GR2> create table times
  2  nologging
  3  as
  4  select dt, 'Q'||to_char(dt,'Q')||'_'||to_char(dt,'YYYY') as fisc_quater
  5  from (
  6     select to_date('01-JAN-2005','dd-mon-yyyy') + level - 1  as dt
  7     from dual
  8     connect by level < = 2500
  9  )
 10  /

Table created.

Elapsed: 00:00:00.11
rajesh@10GR2>
rajesh@10GR2>
rajesh@10GR2> exec dbms_stats.gather_table_stats(user,'TIMES');

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.14
rajesh@10GR2>
rajesh@10GR2> set autotrace traceonly explain;
rajesh@10GR2>
rajesh@10GR2>
rajesh@10GR2> SELECT s.sales_id, s.amount_sold, s.quantity_sold, times.dt
  2  from sales s, times
  3  where times.dt = cust_date
  4  and  times.fisc_quater = 'Q3_2005'
  5  /

Elapsed: 00:00:00.01

Execution Plan
----------------------------------------------------------
Plan hash value: 1453270267

---------------------------------------------------------------------------------
| Id  | Operation                 | Name  | Rows  | Cost (%CPU)|  Pstart| Pstop |
---------------------------------------------------------------------------------
|   0 | SELECT STATEMENT          |       |  6230K| 12218   (3)|        |       |
|*  1 |  HASH JOIN                |       |  6230K| 12218   (3)|      |       |
|*  2 |   TABLE ACCESS FULL       | TIMES |    89 |     4   (0)|   |       |
|   3 |   PARTITION RANGE SUBQUERY|       |    11M| 12125   (2)| KEY(SQ)|KEY(SQ)|
|   4 |    PARTITION HASH ALL     |       |    11M| 12125   (2)|      1 |     4 |
|   5 |     TABLE ACCESS FULL     | SALES |    11M| 12125   (2)|    KEY |   KEY |
---------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   1 - access("TIMES"."DT"="CUST_DATE")
   2 - filter("TIMES"."FISC_QUATER"='Q3_2005')

rajesh@10GR2>
rajesh@10GR2> set autotrace off;

and Tkprof show me this,

SELECT s.sales_id, s.amount_sold, s.quantity_sold, times.dt
from sales s, times
where times.dt = cust_date
and  times.fisc_quater = 'Q3_2005'

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.01       0.00          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch    68435     20.78      44.02      21906     114602          0    10265035
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total    68437     20.79      44.03      21906     114602          0    10265035

Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 100 

Rows     Row Source Operation
-------  ---------------------------------------------------
10265035  HASH JOIN  (cr=114612 pr=21906 pw=0 time=102762307 us)
     92   TABLE ACCESS FULL TIMES (cr=10 pr=0 pw=0 time=158 us)
10265035   PARTITION RANGE SUBQUERY PARTITION: KEY(SUBQUERY) KEY(SUBQUERY) (cr=114602 pr=21906
10265035    PARTITION HASH ALL PARTITION: 1 4 (cr=114592 pr=21906 pw=0 time=51325251 us)
10265035     TABLE ACCESS FULL SALES PARTITION: KEY KEY (cr=114592 pr=21906 pw=0 time=41082


and Tkprof with Explain plan option shows me this,


SELECT s.sales_id, s.amount_sold, s.quantity_sold, times.dt
from sales s, times
where times.dt = cust_date
and  times.fisc_quater = 'Q3_2005'

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.01       0.00          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch    68435     20.78      44.02      21906     114602          0    10265035
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total    68437     20.79      44.03      21906     114602          0    10265035

Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 100  (RAJESH)

Rows     Row Source Operation
-------  ---------------------------------------------------
10265035  HASH JOIN  (cr=114612 pr=21906 pw=0 time=102762307 us)
     92   TABLE ACCESS FULL TIMES (cr=10 pr=0 pw=0 time=158 us)
10265035   PARTITION RANGE SUBQUERY PARTITION: KEY(SUBQUERY) KEY(SUBQUERY) (cr=114602 pr=21906
10265035    PARTITION HASH ALL PARTITION: 1 4 (cr=114592 pr=21906 pw=0 time=51325251 us)
10265035     TABLE ACCESS FULL SALES PARTITION: KEY KEY (cr=114592 pr=21906 pw=0 time=41082


Rows     Execution Plan
-------  ---------------------------------------------------
      0  SELECT STATEMENT   MODE: ALL_ROWS
10265035   HASH JOIN
     92    TABLE ACCESS   MODE: ANALYZED (FULL) OF 'TIMES' (TABLE)
10265035    PARTITION RANGE (SUBQUERY) PARTITION:KEY(SUBQUERY)KEY(SUBQUERY)
              SELECT distinct TBL$OR$IDX$PART$NUM("SALES", 0, d#, p#,
              "CUST_DATE") FROM (SELECT "TIMES"."DT" "CUST_DATE" FROM
              "TIMES" "TIMES" WHERE "TIMES"."FISC_QUATER"='Q3_2005') ORDER
                BY 1
10265035     PARTITION HASH (ALL) PARTITION: START=1 STOP=4
10265035      TABLE ACCESS   MODE: ANALYZED (FULL) OF 'SALES' (TABLE)
                 PARTITION:KEYKEY


As you see, Oracle has dynamically constructed a co-related subquery and used its results to drive this PARTITION RANGE (SUBQUERY)  access path.

SELECT distinct TBL$OR$IDX$PART$NUM("SALES", 0, 1, 0, "CUST_DATE") FROM (SELECT "TIMES"."DT" "CUST_DATE" FROM "TIMES" "TIMES" WHERE
 "TIMES"."FISC_QUATER"='Q3_2005') ORDER BY 1

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.01       0.00          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch        2      0.00       0.00          0         10          0           1
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        4      0.01       0.01          0         10          0           1

Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 100  (RAJESH)   (recursive depth: 1)

Rows     Row Source Operation
-------  ---------------------------------------------------
      1  SORT UNIQUE (cr=10 pr=0 pw=0 time=329 us)
     92   TABLE ACCESS FULL TIMES (cr=10 pr=0 pw=0 time=152 us)

Rows     Execution Plan
-------  ---------------------------------------------------
      0  SELECT STATEMENT   MODE: ALL_ROWS
      1   SORT (UNIQUE)
     92    TABLE ACCESS   MODE: ANALYZED (FULL) OF 'TIMES' (TABLE)


I was wondering why should oracle use HASH JOIN since one of the data source in plan (table TIMES having 84 rows) is providing a low cardinality. Instead use NESTED LOOPS ( pick one table having low cardinality and iterate its rows over the other table and find matching rows )


Now, Forcing Nested Loops hint in query plan changes like this.


rajesh@10GR2> SELECT /*+ use_nl(s,t) */ s.sales_id, s.amount_sold, s.quantity_sold, t.dt
  2  from sales s, times t
  3  where t.dt = s.cust_date
  4  and  t.fisc_quater = 'Q3_2005'
  5  /

Elapsed: 00:00:00.01

Execution Plan
----------------------------------------------------------
Plan hash value: 1526662114
---------------------------------------------------------------------------------
| Id  | Operation                 | Name  | Rows  | Cost (%CPU)|  Pstart| Pstop |
---------------------------------------------------------------------------------
|   0 | SELECT STATEMENT          |       |  6230K| 38540   (2)|        |       |
|   1 |  NESTED LOOPS             |       |  6230K| 38540   (2)|        |       |
|*  2 |   TABLE ACCESS FULL       | TIMES |    89 |     4   (0)|        |       |
|   3 |   PARTITION RANGE ITERATOR|       | 69786 |   433   (2)|    KEY |   KEY |
|   4 |    PARTITION HASH ALL     |       | 69786 |   433   (2)|      1 |     4 |
|*  5 |     TABLE ACCESS FULL     | SALES | 69786 |   433   (2)|    KEY |   KEY |
---------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter("T"."FISC_QUATER"='Q3_2005')
   5 - filter("T"."DT"="S"."CUST_DATE")

rajesh@10GR2>


Now, the access path changes to PARTITION RANGE ITERATOR and its no more PARTITION RANGE SUBQUERY. Tkprof shows me this,


SELECT /*+ use_nl(s,t) */ s.sales_id, s.amount_sold, s.quantity_sold, t.dt
from sales s, times t
where t.dt = s.cust_date
and  t.fisc_quater = 'Q3_2005'

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.01       0.00          0          0          2           0
Execute      1      0.00       0.00          0          0          0           0
Fetch    68435    189.48    3799.59    3289938    4344283          0    10265035
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total    68437    189.50    3799.60    3289938    4344283          2    10265035

Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 100  (RAJESH)


Thats a Huge difference ! using HASH join it took 44 sec but using NESTED LOOPS it took 3800 secs (almost 1 hrs, 3 mins).

Replicated these table's ( SALES, TIMES) to 9iR2 database and I dont see this PARTITION RANGE SUBQUERY  happening there. Its nowhere available in 9i product documentation.

rajesh@9IR2> select * from v$version;

BANNER
----------------------------------------------------------------
Oracle9i Enterprise Edition Release 9.2.0.8.0 - Production
PL/SQL Release 9.2.0.8.0 - Production
CORE    9.2.0.8.0       Production
TNS for 32-bit Windows: Version 9.2.0.8.0 - Production
NLSRTL Version 9.2.0.8.0 - Production

Elapsed: 00:00:00.01
rajesh@9IR2>
rajesh@9IR2> SET AUTOTRACE TRACEONLY EXPLAIN;
rajesh@9IR2>
rajesh@9IR2> SELECT s.sales_id, s.quantity_sold, s.amount_sold, t.dt
  2  from sales s, times t
  3  where s.cust_date = t.dt
  4  and t.fisc_quater = 'Q3_2005'
  5  /

Elapsed: 00:00:00.01

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=5407 Card=3143029 Bytes=122578131)
   1    0   HASH JOIN (Cost=5407 Card=3143029 Bytes=122578131)
   2    1     TABLE ACCESS (FULL) OF 'TIMES' (Cost=3 Card=89 Bytes=1424)
   3    1     PARTITION RANGE (ITERATOR)   4    3       PARTITION HASH (ALL)
   5    4         TABLE ACCESS (FULL) OF 'SALES' (Cost=5234 Card=11863049 Bytes=272850127)

rajesh@9IR2>
rajesh@9IR2> SET AUTOTRACE OFF;
rajesh@9IR2>

And Tkprof shows me this,

SELECT s.sales_id, s.quantity_sold, s.amount_sold, t.dt
from sales s, times t
where s.cust_date = t.dt
and t.fisc_quater = 'Q3_2005'

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch    68435     12.43      45.66      45959     114727          0    10265035
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total    68437     12.43      45.66      45959     114727          0    10265035

Misses in library cache during parse: 1
Optimizer mode: CHOOSE
Parsing user id: 107 

Rows     Row Source Operation
-------  ---------------------------------------------------
10265035  HASH JOIN 
     92   TABLE ACCESS FULL TIMES
10265035   PARTITION RANGE ITERATOR PARTITION: KEY KEY
10265035    PARTITION HASH ALL PARTITION: 1 4
10265035     TABLE ACCESS FULL SALES PARTITION: KEY KEY

Now, I am wondering really why this PARTITION RANGE SUBQUERY access path is introduced in 10g database still Oracle 9i can do that efficiently without having PARTITION RANGE SUBQUERY in place.

No comments:

Post a Comment