A full partition-wise join divides a large join into smaller joins between a pair of partitions from the two joined tables. To use this feature, you must equipartition both tables on their join keys. For example, consider a large join between a sales table and a customer table on the column cust_id, the query " find the sales details for all customers who are effective ( having cust_date in Q3 2005 ) in Q3 2005 " is a typical example of a SQL Statement performing such a join is
SELECT c.* , s.*
from customers c, sales s
where c.cust_id = s.cust_id
and c.cust_date between to_date('01-jul-2005','dd-mon-yyyy')
and to_date('30-sep-2005','dd-mon-yyyy');
If we equipartition both customers and sales table based on this join keys like below.
rajesh@10GR2> create table customers
2 (
3 CUST_ID ,
4 CUST_FIRST_NAME ,
5 CUST_LAST_NAME ,
6 CUST_GENDER ,
7 CUST_DATE ,
8 CUST_POSTAL_CODE ,
9 CUST_CITY ,
10 CUST_TOTAL
11 )
12 partition by range(CUST_DATE)
13 subpartition by hash(cust_id)
14 subpartition template
15 (
16 subpartition sp1,
17 subpartition sp2,
18 subpartition sp3,
19 subpartition sp4
20 )
21 (
22 partition p_q1_2005 values less than ( to_date('01-apr-2005','dd-mon-yyyy') ),
23 partition p_q2_2005 values less than ( to_date('01-jul-2005','dd-mon-yyyy') ),
24 partition p_q3_2005 values less than ( to_date('01-oct-2005','dd-mon-yyyy') ),
25 partition p_q4_2005 values less than ( to_date('01-jan-2006','dd-mon-yyyy') ),
26 partition p_q1_2006 values less than ( to_date('01-apr-2006','dd-mon-yyyy') ),
27 partition p_q2_2006 values less than ( to_date('01-jul-2006','dd-mon-yyyy') ),
28 partition p_q3_2006 values less than ( to_date('01-oct-2006','dd-mon-yyyy') ),
29 partition p_q4_2006 values less than ( to_date('01-jan-2007','dd-mon-yyyy') ),
30 partition p_q1_2007 values less than ( to_date('01-apr-2007','dd-mon-yyyy') ),
31 partition p_q2_2007 values less than ( to_date('01-jul-2007','dd-mon-yyyy') ),
32 partition p_q3_2007 values less than ( to_date('01-oct-2007','dd-mon-yyyy') ),
33 partition p_q4_2007 values less than ( to_date('01-jan-2008','dd-mon-yyyy') ),
34 partition p_q1_2008 values less than ( to_date('01-apr-2008','dd-mon-yyyy') ),
35 partition p_q2_2008 values less than ( to_date('01-jul-2008','dd-mon-yyyy') ),
36 partition p_q3_2008 values less than ( to_date('01-oct-2008','dd-mon-yyyy') ),
37 partition p_q4_2008 values less than ( to_date('01-jan-2009','dd-mon-yyyy') ),
38 partition p_q1_2009 values less than ( to_date('01-apr-2009','dd-mon-yyyy') ),
39 partition p_q2_2009 values less than ( to_date('01-jul-2009','dd-mon-yyyy') ),
40 partition p_q3_2009 values less than ( to_date('01-oct-2009','dd-mon-yyyy') ),
41 partition p_q4_2009 values less than ( to_date('01-jan-2010','dd-mon-yyyy') ),
42 partition p_q1_2010 values less than ( to_date('01-apr-2010','dd-mon-yyyy') ),
43 partition p_q2_2010 values less than ( to_date('01-jul-2010','dd-mon-yyyy') ),
44 partition p_q3_2010 values less than ( to_date('01-oct-2010','dd-mon-yyyy') ),
45 partition p_q4_2010 values less than ( to_date('01-jan-2011','dd-mon-yyyy') ),
46 partition p_q1_2011 values less than ( to_date('01-apr-2011','dd-mon-yyyy') ),
47 partition p_q2_2011 values less than ( to_date('01-jul-2011','dd-mon-yyyy') ),
48 partition p_q3_2011 values less than ( to_date('01-oct-2011','dd-mon-yyyy') ),
49 partition p_q4_2011 values less than ( to_date('01-jan-2012','dd-mon-yyyy') )
50 )
51 nologging
52 as
53 select CUST_ID ,
54 CUST_FIRST_NAME ,
55 CUST_LAST_NAME ,
56 CUST_GENDER ,
57 CUST_DATE ,
58 CUST_POSTAL_CODE ,
59 CUST_CITY ,
60 CUST_TOTAL
61 from customers_data;
Table created.
Elapsed: 00:00:16.24
rajesh@10GR2>
rajesh@10GR2>
rajesh@10GR2> exec dbms_stats.gather_table_stats(user,'CUSTOMERS');
PL/SQL procedure successfully completed.
Elapsed: 00:00:08.64
rajesh@10GR2>
rajesh@10GR2>
rajesh@10GR2> create table sales
2 (
3 SALES_ID ,
4 CUST_ID ,
5 QUANTITY_SOLD ,
6 AMOUNT_SOLD ,
7 CUST_DATE
8 )
9 partition by range(CUST_DATE)
10 subpartition by hash(cust_id)
11 subpartition template
12 (
13 subpartition sp1,
14 subpartition sp2,
15 subpartition sp3,
16 subpartition sp4
17 )
18 (
19 partition p_q1_2005 values less than ( to_date('01-apr-2005','dd-mon-yyyy') ),
20 partition p_q2_2005 values less than ( to_date('01-jul-2005','dd-mon-yyyy') ),
21 partition p_q3_2005 values less than ( to_date('01-oct-2005','dd-mon-yyyy') ),
22 partition p_q4_2005 values less than ( to_date('01-jan-2006','dd-mon-yyyy') ),
23 partition p_q1_2006 values less than ( to_date('01-apr-2006','dd-mon-yyyy') ),
24 partition p_q2_2006 values less than ( to_date('01-jul-2006','dd-mon-yyyy') ),
25 partition p_q3_2006 values less than ( to_date('01-oct-2006','dd-mon-yyyy') ),
26 partition p_q4_2006 values less than ( to_date('01-jan-2007','dd-mon-yyyy') ),
27 partition p_q1_2007 values less than ( to_date('01-apr-2007','dd-mon-yyyy') ),
28 partition p_q2_2007 values less than ( to_date('01-jul-2007','dd-mon-yyyy') ),
29 partition p_q3_2007 values less than ( to_date('01-oct-2007','dd-mon-yyyy') ),
30 partition p_q4_2007 values less than ( to_date('01-jan-2008','dd-mon-yyyy') ),
31 partition p_q1_2008 values less than ( to_date('01-apr-2008','dd-mon-yyyy') ),
32 partition p_q2_2008 values less than ( to_date('01-jul-2008','dd-mon-yyyy') ),
33 partition p_q3_2008 values less than ( to_date('01-oct-2008','dd-mon-yyyy') ),
34 partition p_q4_2008 values less than ( to_date('01-jan-2009','dd-mon-yyyy') ),
35 partition p_q1_2009 values less than ( to_date('01-apr-2009','dd-mon-yyyy') ),
36 partition p_q2_2009 values less than ( to_date('01-jul-2009','dd-mon-yyyy') ),
37 partition p_q3_2009 values less than ( to_date('01-oct-2009','dd-mon-yyyy') ),
38 partition p_q4_2009 values less than ( to_date('01-jan-2010','dd-mon-yyyy') ),
39 partition p_q1_2010 values less than ( to_date('01-apr-2010','dd-mon-yyyy') ),
40 partition p_q2_2010 values less than ( to_date('01-jul-2010','dd-mon-yyyy') ),
41 partition p_q3_2010 values less than ( to_date('01-oct-2010','dd-mon-yyyy') ),
42 partition p_q4_2010 values less than ( to_date('01-jan-2011','dd-mon-yyyy') ),
43 partition p_q1_2011 values less than ( to_date('01-apr-2011','dd-mon-yyyy') ),
44 partition p_q2_2011 values less than ( to_date('01-jul-2011','dd-mon-yyyy') ),
45 partition p_q3_2011 values less than ( to_date('01-oct-2011','dd-mon-yyyy') ),
46 partition p_q4_2011 values less than ( to_date('01-jan-2012','dd-mon-yyyy') )
47 )
48 nologging
49 as
50 select SALES_ID ,
51 CUST_ID ,
52 QUANTITY_SOLD ,
53 AMOUNT_SOLD ,
54 CUST_DATE
55 from sales_data;
Table created.
Elapsed: 00:07:03.90
rajesh@10GR2> alter table customers add constraint cust_pk primary key(cust_id);
Table altered.
Elapsed: 00:00:11.01
rajesh@10GR2> alter table sales add constraint sales_fk foreign key(cust_id) references customers;
Table altered.
Elapsed: 00:00:19.17
rajesh@10GR2> select count(*) from sales;
COUNT(*)
----------
11863049
Elapsed: 00:01:12.48
rajesh@10GR2> select count(*) from customers;
COUNT(*)
----------
1026072
Elapsed: 00:00:00.71
rajesh@10GR2>
rajesh@10GR2> SELECT c.* , s.*
2 from customers c, sales s
3 where c.cust_id = s.cust_id
4 and c.cust_date between to_date('01-jul-2005','dd-mon-yyyy')
5 and to_date('30-sep-2005','dd-mon-yyyy');
Elapsed: 00:00:00.01
Execution Plan
----------------------------------------------------------
Plan hash value: 3704956898
-----------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Cost (%CPU)| Pstart| Pstop |
-----------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 11M| 39198 (2)| | |
| 1 | PARTITION HASH ALL | | 11M| 39198 (2)| 1 | 4 |
|* 2 | HASH JOIN | | 11M| 39198 (2)| | |
| 3 | PARTITION RANGE SINGLE| | 929K| 1875 (2)| 3 | 3 |
|* 4 | TABLE ACCESS FULL | CUSTOMERS | 929K| 1875 (2)| 9 | 12 |
| 5 | PARTITION RANGE ALL | | 11M| 12125 (2)| 1 | 28 |
| 6 | TABLE ACCESS FULL | SALES | 11M| 12125 (2)| 1 | 112 |
-----------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("C"."CUST_ID"="S"."CUST_ID")
4 - filter("C"."CUST_DATE"<=TO_DATE('2005-09-30 00:00:00', 'yyyy-mm-dd hh24:mi:ss'))
rajesh@10GR2>
SELECT c.* , s.*
from customers c, sales s
where c.cust_id = s.cust_id
and c.cust_date between to_date('01-jul-2005','dd-mon-yyyy')
and to_date('30-sep-2005','dd-mon-yyyy')
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 68435 52.31 83.57 47086 130237 0 10265035
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 68437 52.31 83.59 47086 130237 0 10265035
Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 100
Rows Row Source Operation
------- ---------------------------------------------------
10265035 PARTITION HASH ALL PARTITION: 1 4 (cr=130237 pr=47086 pw=744 time=64025850 us)
10265035 HASH JOIN (cr=130237 pr=47086 pw=744 time=54649725 us)
884070 PARTITION RANGE SINGLE PARTITION: 3 3 (cr=8054 pr=0 pw=0 time=1768764 us)
884070 TABLE ACCESS FULL CUSTOMERS PARTITION: 9 12 (cr=8054 pr=0 pw=0 time=884626 us)
11863049 PARTITION RANGE ALL PARTITION: 1 28 (cr=122183 pr=46342 pw=0 time=71307400 us)
11863049 TABLE ACCESS FULL SALES PARTITION: 1 112 (cr=122183 pr=46342 pw=0 time=5957447
( Awesome, it processed 10M records in 83 secs )
Now, just changing to query to "find the sales details for all customers who made Transactions in Q3 2005 ", causes the plan to change like this.
rajesh@10GR2> SELECT c.* , s.*
2 from customers c, sales s
3 where c.cust_id = s.cust_id
4 and c.cust_date = s.cust_date
5 and c.cust_date between to_date('01-jul-2005','dd-mon-yyyy')
6 and to_date('30-sep-2005','dd-mon-yyyy')
7 /
Elapsed: 00:00:00.03
Execution Plan
----------------------------------------------------------
Plan hash value: 3606662981
-----------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Cost (%CPU)| Pstart| Pstop |
-----------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 801K| 34547 (2)| | |
| 1 | PARTITION HASH ALL | | 801K| 34547 (2)| 1 | 4 ||* 2 | HASH JOIN | | 801K| 34547 (2)| | |
| 3 | PARTITION RANGE SINGLE| | 929K| 1875 (2)| 3 | 3 |
|* 4 | TABLE ACCESS FULL | CUSTOMERS | 929K| 1875 (2)| 9 | 12 |
| 5 | PARTITION RANGE SINGLE| | 10M| 10510 (3)| 3 | 3 |
|* 6 | TABLE ACCESS FULL | SALES | 10M| 10510 (3)| 9 | 12 |
-----------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("C"."CUST_ID"="S"."CUST_ID" AND "C"."CUST_DATE"="S"."CUST_DATE")
4 - filter("C"."CUST_DATE"<=TO_DATE('2005-09-30 00:00:00', 'yyyy-mm-dd hh24:mi:ss'))
6 - filter("S"."CUST_DATE"<=TO_DATE('2005-09-30 00:00:00', 'yyyy-mm-dd hh24:mi:ss'))
rajesh@10GR2>
And the Tkprof show me,
SELECT c.* , s.*
from customers c, sales s
where c.cust_id = s.cust_id
and c.cust_date = s.cust_date
and c.cust_date between to_date('01-jul-2005','dd-mon-yyyy')
and to_date('30-sep-2005','dd-mon-yyyy')
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 7205 14.59 63.64 46889 61643 0 1080530
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 7207 14.59 63.65 46889 61643 0 1080530
Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 100
Rows Row Source Operation
------- ---------------------------------------------------
1080530 PARTITION HASH ALL PARTITION: 1 4 (cr=61643 pr=46889 pw=775 time=13521195 us)
1080530 HASH JOIN (cr=61643 pr=46889 pw=775 time=13366665 us)
884070 PARTITION RANGE SINGLE PARTITION: 3 3 (cr=7992 pr=0 pw=0 time=1768864 us)
884070 TABLE ACCESS FULL CUSTOMERS PARTITION: 9 12 (cr=7992 pr=0 pw=0 time=884703 us)
10265035 PARTITION RANGE SINGLE PARTITION: 3 3 (cr=53651 pr=46114 pw=0 time=61625325 us)
10265035 TABLE ACCESS FULL SALES PARTITION: 9 12 (cr=53651 pr=46114 pw=0 time=41095102
Full partition-wise join will takes place only if you, equipartition both tables on their join keys. So, what will happen if you don't equipartition both tables.
scott@10GR2> create table customers
2 (
3 CUST_ID ,
4 CUST_FIRST_NAME ,
5 CUST_LAST_NAME ,
6 CUST_GENDER ,
7 CUST_DATE ,
8 CUST_POSTAL_CODE ,
9 CUST_CITY ,
10 CUST_TOTAL
11 )
12 partition by hash(cust_id) partitions 8
13 nologging
14 as
15 select CUST_ID ,
16 CUST_FIRST_NAME ,
17 CUST_LAST_NAME ,
18 CUST_GENDER ,
19 CUST_DATE ,
20 CUST_POSTAL_CODE ,
21 CUST_CITY ,
22 CUST_TOTAL
23 from customers_data
24 where 1=0;
Table created.
Elapsed: 00:00:00.25
scott@10GR2>
scott@10GR2>
scott@10GR2> create table sales
2 (
3 SALES_ID ,
4 CUST_ID ,
5 QUANTITY_SOLD ,
6 AMOUNT_SOLD ,
7 CUST_DATE
8 )
9 partition by hash(SALES_ID) partitions 8
10 nologging
11 as
12 select SALES_ID ,
13 CUST_ID ,
14 QUANTITY_SOLD ,
15 AMOUNT_SOLD ,
16 CUST_DATE
17 from sales_data
18 where 1=0;
Table created.
Elapsed: 00:00:00.21
scott@10GR2>
Now, Customer table is HASH partitioned on customer_id and Sales table is HASH partitioned on sales_id columns and now the Plan changes like below.
scott@10GR2>
scott@10GR2> begin
2 dbms_stats.set_table_stats(
3 ownname=>user,
4 tabname=>'CUSTOMERS',
5 numrows => 1026072,
6 numblks => 1000000 );
7
8 dbms_stats.set_table_stats(
9 ownname=>user,
10 tabname=>'SALES',
11 numrows => 11863049,
12 numblks => 1000000 );
13 end;
14 /
PL/SQL procedure successfully completed.
Elapsed: 00:00:00.10
scott@10GR2>
scott@10GR2> set autotrace traceonly explain;
scott@10GR2>
scott@10GR2> SELECT c.* , s.*
2 from customers c, sales s
3 where c.cust_id = s.cust_id
4 and c.cust_date between to_date('01-jul-2005','dd-mon-yyyy')
5 and to_date('30-sep-2005','dd-mon-yyyy');
Elapsed: 00:00:00.01
Execution Plan
----------------------------------------------------------
Plan hash value: 742433711
------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Cost (%CPU)| Pstart| Pstop |
------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 29658 | 438K (1)| | |
|* 1 | HASH JOIN | | 29658 | 438K (1)| | |
| 2 | PARTITION HASH ALL| | 2565 | 219K (1)| 1 | 8 |
|* 3 | TABLE ACCESS FULL| CUSTOMERS | 2565 | 219K (1)| 1 | 8 |
| 4 | PARTITION HASH ALL| | 11M| 219K (1)| 1 | 8 |
| 5 | TABLE ACCESS FULL| SALES | 11M| 219K (1)| 1 | 8 |
------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - access("C"."CUST_ID"="S"."CUST_ID")
3 - filter("C"."CUST_DATE">=TO_DATE('2005-07-01 00:00:00', 'yyyy-mm-dd hh24:mi:ss')
AND "C"."CUST_DATE"<=TO_DATE('2005-09-30 00:00:00', 'yyyy-mm-dd hh24:mi:ss'))
scott@10GR2>
You can see the Full partitioned wise join doesn't happen now ( PARTITION HASH ALL access path missing above HASH JOIN in plans ).
These kinds of large join is typical in data warehousing environments, when data-model is designed properly Full Partition Wise Joins will really be help full.
SELECT c.* , s.*
from customers c, sales s
where c.cust_id = s.cust_id
and c.cust_date between to_date('01-jul-2005','dd-mon-yyyy')
and to_date('30-sep-2005','dd-mon-yyyy');
If we equipartition both customers and sales table based on this join keys like below.
rajesh@10GR2> create table customers
2 (
3 CUST_ID ,
4 CUST_FIRST_NAME ,
5 CUST_LAST_NAME ,
6 CUST_GENDER ,
7 CUST_DATE ,
8 CUST_POSTAL_CODE ,
9 CUST_CITY ,
10 CUST_TOTAL
11 )
12 partition by range(CUST_DATE)
13 subpartition by hash(cust_id)
14 subpartition template
15 (
16 subpartition sp1,
17 subpartition sp2,
18 subpartition sp3,
19 subpartition sp4
20 )
21 (
22 partition p_q1_2005 values less than ( to_date('01-apr-2005','dd-mon-yyyy') ),
23 partition p_q2_2005 values less than ( to_date('01-jul-2005','dd-mon-yyyy') ),
24 partition p_q3_2005 values less than ( to_date('01-oct-2005','dd-mon-yyyy') ),
25 partition p_q4_2005 values less than ( to_date('01-jan-2006','dd-mon-yyyy') ),
26 partition p_q1_2006 values less than ( to_date('01-apr-2006','dd-mon-yyyy') ),
27 partition p_q2_2006 values less than ( to_date('01-jul-2006','dd-mon-yyyy') ),
28 partition p_q3_2006 values less than ( to_date('01-oct-2006','dd-mon-yyyy') ),
29 partition p_q4_2006 values less than ( to_date('01-jan-2007','dd-mon-yyyy') ),
30 partition p_q1_2007 values less than ( to_date('01-apr-2007','dd-mon-yyyy') ),
31 partition p_q2_2007 values less than ( to_date('01-jul-2007','dd-mon-yyyy') ),
32 partition p_q3_2007 values less than ( to_date('01-oct-2007','dd-mon-yyyy') ),
33 partition p_q4_2007 values less than ( to_date('01-jan-2008','dd-mon-yyyy') ),
34 partition p_q1_2008 values less than ( to_date('01-apr-2008','dd-mon-yyyy') ),
35 partition p_q2_2008 values less than ( to_date('01-jul-2008','dd-mon-yyyy') ),
36 partition p_q3_2008 values less than ( to_date('01-oct-2008','dd-mon-yyyy') ),
37 partition p_q4_2008 values less than ( to_date('01-jan-2009','dd-mon-yyyy') ),
38 partition p_q1_2009 values less than ( to_date('01-apr-2009','dd-mon-yyyy') ),
39 partition p_q2_2009 values less than ( to_date('01-jul-2009','dd-mon-yyyy') ),
40 partition p_q3_2009 values less than ( to_date('01-oct-2009','dd-mon-yyyy') ),
41 partition p_q4_2009 values less than ( to_date('01-jan-2010','dd-mon-yyyy') ),
42 partition p_q1_2010 values less than ( to_date('01-apr-2010','dd-mon-yyyy') ),
43 partition p_q2_2010 values less than ( to_date('01-jul-2010','dd-mon-yyyy') ),
44 partition p_q3_2010 values less than ( to_date('01-oct-2010','dd-mon-yyyy') ),
45 partition p_q4_2010 values less than ( to_date('01-jan-2011','dd-mon-yyyy') ),
46 partition p_q1_2011 values less than ( to_date('01-apr-2011','dd-mon-yyyy') ),
47 partition p_q2_2011 values less than ( to_date('01-jul-2011','dd-mon-yyyy') ),
48 partition p_q3_2011 values less than ( to_date('01-oct-2011','dd-mon-yyyy') ),
49 partition p_q4_2011 values less than ( to_date('01-jan-2012','dd-mon-yyyy') )
50 )
51 nologging
52 as
53 select CUST_ID ,
54 CUST_FIRST_NAME ,
55 CUST_LAST_NAME ,
56 CUST_GENDER ,
57 CUST_DATE ,
58 CUST_POSTAL_CODE ,
59 CUST_CITY ,
60 CUST_TOTAL
61 from customers_data;
Table created.
Elapsed: 00:00:16.24
rajesh@10GR2>
rajesh@10GR2>
rajesh@10GR2> exec dbms_stats.gather_table_stats(user,'CUSTOMERS');
PL/SQL procedure successfully completed.
Elapsed: 00:00:08.64
rajesh@10GR2>
rajesh@10GR2>
rajesh@10GR2> create table sales
2 (
3 SALES_ID ,
4 CUST_ID ,
5 QUANTITY_SOLD ,
6 AMOUNT_SOLD ,
7 CUST_DATE
8 )
9 partition by range(CUST_DATE)
10 subpartition by hash(cust_id)
11 subpartition template
12 (
13 subpartition sp1,
14 subpartition sp2,
15 subpartition sp3,
16 subpartition sp4
17 )
18 (
19 partition p_q1_2005 values less than ( to_date('01-apr-2005','dd-mon-yyyy') ),
20 partition p_q2_2005 values less than ( to_date('01-jul-2005','dd-mon-yyyy') ),
21 partition p_q3_2005 values less than ( to_date('01-oct-2005','dd-mon-yyyy') ),
22 partition p_q4_2005 values less than ( to_date('01-jan-2006','dd-mon-yyyy') ),
23 partition p_q1_2006 values less than ( to_date('01-apr-2006','dd-mon-yyyy') ),
24 partition p_q2_2006 values less than ( to_date('01-jul-2006','dd-mon-yyyy') ),
25 partition p_q3_2006 values less than ( to_date('01-oct-2006','dd-mon-yyyy') ),
26 partition p_q4_2006 values less than ( to_date('01-jan-2007','dd-mon-yyyy') ),
27 partition p_q1_2007 values less than ( to_date('01-apr-2007','dd-mon-yyyy') ),
28 partition p_q2_2007 values less than ( to_date('01-jul-2007','dd-mon-yyyy') ),
29 partition p_q3_2007 values less than ( to_date('01-oct-2007','dd-mon-yyyy') ),
30 partition p_q4_2007 values less than ( to_date('01-jan-2008','dd-mon-yyyy') ),
31 partition p_q1_2008 values less than ( to_date('01-apr-2008','dd-mon-yyyy') ),
32 partition p_q2_2008 values less than ( to_date('01-jul-2008','dd-mon-yyyy') ),
33 partition p_q3_2008 values less than ( to_date('01-oct-2008','dd-mon-yyyy') ),
34 partition p_q4_2008 values less than ( to_date('01-jan-2009','dd-mon-yyyy') ),
35 partition p_q1_2009 values less than ( to_date('01-apr-2009','dd-mon-yyyy') ),
36 partition p_q2_2009 values less than ( to_date('01-jul-2009','dd-mon-yyyy') ),
37 partition p_q3_2009 values less than ( to_date('01-oct-2009','dd-mon-yyyy') ),
38 partition p_q4_2009 values less than ( to_date('01-jan-2010','dd-mon-yyyy') ),
39 partition p_q1_2010 values less than ( to_date('01-apr-2010','dd-mon-yyyy') ),
40 partition p_q2_2010 values less than ( to_date('01-jul-2010','dd-mon-yyyy') ),
41 partition p_q3_2010 values less than ( to_date('01-oct-2010','dd-mon-yyyy') ),
42 partition p_q4_2010 values less than ( to_date('01-jan-2011','dd-mon-yyyy') ),
43 partition p_q1_2011 values less than ( to_date('01-apr-2011','dd-mon-yyyy') ),
44 partition p_q2_2011 values less than ( to_date('01-jul-2011','dd-mon-yyyy') ),
45 partition p_q3_2011 values less than ( to_date('01-oct-2011','dd-mon-yyyy') ),
46 partition p_q4_2011 values less than ( to_date('01-jan-2012','dd-mon-yyyy') )
47 )
48 nologging
49 as
50 select SALES_ID ,
51 CUST_ID ,
52 QUANTITY_SOLD ,
53 AMOUNT_SOLD ,
54 CUST_DATE
55 from sales_data;
Table created.
Elapsed: 00:07:03.90
rajesh@10GR2> alter table customers add constraint cust_pk primary key(cust_id);
Table altered.
Elapsed: 00:00:11.01
rajesh@10GR2> alter table sales add constraint sales_fk foreign key(cust_id) references customers;
Table altered.
Elapsed: 00:00:19.17
rajesh@10GR2> select count(*) from sales;
COUNT(*)
----------
11863049
Elapsed: 00:01:12.48
rajesh@10GR2> select count(*) from customers;
COUNT(*)
----------
1026072
Elapsed: 00:00:00.71
rajesh@10GR2>
rajesh@10GR2> SELECT c.* , s.*
2 from customers c, sales s
3 where c.cust_id = s.cust_id
4 and c.cust_date between to_date('01-jul-2005','dd-mon-yyyy')
5 and to_date('30-sep-2005','dd-mon-yyyy');
Elapsed: 00:00:00.01
Execution Plan
----------------------------------------------------------
Plan hash value: 3704956898
-----------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Cost (%CPU)| Pstart| Pstop |
-----------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 11M| 39198 (2)| | |
| 1 | PARTITION HASH ALL | | 11M| 39198 (2)| 1 | 4 |
|* 2 | HASH JOIN | | 11M| 39198 (2)| | |
| 3 | PARTITION RANGE SINGLE| | 929K| 1875 (2)| 3 | 3 |
|* 4 | TABLE ACCESS FULL | CUSTOMERS | 929K| 1875 (2)| 9 | 12 |
| 5 | PARTITION RANGE ALL | | 11M| 12125 (2)| 1 | 28 |
| 6 | TABLE ACCESS FULL | SALES | 11M| 12125 (2)| 1 | 112 |
-----------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("C"."CUST_ID"="S"."CUST_ID")
4 - filter("C"."CUST_DATE"<=TO_DATE('2005-09-30 00:00:00', 'yyyy-mm-dd hh24:mi:ss'))
rajesh@10GR2>
- All HASH Partition in SALES table is scanned (this is shown in plan as Pstart = 1 and Pstop = 112 at Id=6)
- Only 4 Hash Partition in CUSTOMERS table corresponding to 2005, Q3 is scanned (this is shown in plan as Pstart = 9 and Pstop = 12 at Id=4)
- the Access path PARTITION HASH ALL from plan says, only the 4 Hash Partition from CUSTOMERS table is joined (Full Partition-Wise Joins) with 4 Hash Partition from sales table (this is shown in plan as Pstart = 1 and Pstop = 4 at Id=1)
SELECT c.* , s.*
from customers c, sales s
where c.cust_id = s.cust_id
and c.cust_date between to_date('01-jul-2005','dd-mon-yyyy')
and to_date('30-sep-2005','dd-mon-yyyy')
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 68435 52.31 83.57 47086 130237 0 10265035
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 68437 52.31 83.59 47086 130237 0 10265035
Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 100
Rows Row Source Operation
------- ---------------------------------------------------
10265035 PARTITION HASH ALL PARTITION: 1 4 (cr=130237 pr=47086 pw=744 time=64025850 us)
10265035 HASH JOIN (cr=130237 pr=47086 pw=744 time=54649725 us)
884070 PARTITION RANGE SINGLE PARTITION: 3 3 (cr=8054 pr=0 pw=0 time=1768764 us)
884070 TABLE ACCESS FULL CUSTOMERS PARTITION: 9 12 (cr=8054 pr=0 pw=0 time=884626 us)
11863049 PARTITION RANGE ALL PARTITION: 1 28 (cr=122183 pr=46342 pw=0 time=71307400 us)
11863049 TABLE ACCESS FULL SALES PARTITION: 1 112 (cr=122183 pr=46342 pw=0 time=5957447
( Awesome, it processed 10M records in 83 secs )
Now, just changing to query to "find the sales details for all customers who made Transactions in Q3 2005 ", causes the plan to change like this.
rajesh@10GR2> SELECT c.* , s.*
2 from customers c, sales s
3 where c.cust_id = s.cust_id
4 and c.cust_date = s.cust_date
5 and c.cust_date between to_date('01-jul-2005','dd-mon-yyyy')
6 and to_date('30-sep-2005','dd-mon-yyyy')
7 /
Elapsed: 00:00:00.03
Execution Plan
----------------------------------------------------------
Plan hash value: 3606662981
-----------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Cost (%CPU)| Pstart| Pstop |
-----------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 801K| 34547 (2)| | |
| 1 | PARTITION HASH ALL | | 801K| 34547 (2)| 1 | 4 ||* 2 | HASH JOIN | | 801K| 34547 (2)| | |
| 3 | PARTITION RANGE SINGLE| | 929K| 1875 (2)| 3 | 3 |
|* 4 | TABLE ACCESS FULL | CUSTOMERS | 929K| 1875 (2)| 9 | 12 |
| 5 | PARTITION RANGE SINGLE| | 10M| 10510 (3)| 3 | 3 |
|* 6 | TABLE ACCESS FULL | SALES | 10M| 10510 (3)| 9 | 12 |
-----------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("C"."CUST_ID"="S"."CUST_ID" AND "C"."CUST_DATE"="S"."CUST_DATE")
4 - filter("C"."CUST_DATE"<=TO_DATE('2005-09-30 00:00:00', 'yyyy-mm-dd hh24:mi:ss'))
6 - filter("S"."CUST_DATE"<=TO_DATE('2005-09-30 00:00:00', 'yyyy-mm-dd hh24:mi:ss'))
rajesh@10GR2>
And the Tkprof show me,
SELECT c.* , s.*
from customers c, sales s
where c.cust_id = s.cust_id
and c.cust_date = s.cust_date
and c.cust_date between to_date('01-jul-2005','dd-mon-yyyy')
and to_date('30-sep-2005','dd-mon-yyyy')
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 7205 14.59 63.64 46889 61643 0 1080530
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 7207 14.59 63.65 46889 61643 0 1080530
Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 100
Rows Row Source Operation
------- ---------------------------------------------------
1080530 PARTITION HASH ALL PARTITION: 1 4 (cr=61643 pr=46889 pw=775 time=13521195 us)
1080530 HASH JOIN (cr=61643 pr=46889 pw=775 time=13366665 us)
884070 PARTITION RANGE SINGLE PARTITION: 3 3 (cr=7992 pr=0 pw=0 time=1768864 us)
884070 TABLE ACCESS FULL CUSTOMERS PARTITION: 9 12 (cr=7992 pr=0 pw=0 time=884703 us)
10265035 PARTITION RANGE SINGLE PARTITION: 3 3 (cr=53651 pr=46114 pw=0 time=61625325 us)
10265035 TABLE ACCESS FULL SALES PARTITION: 9 12 (cr=53651 pr=46114 pw=0 time=41095102
Full partition-wise join will takes place only if you, equipartition both tables on their join keys. So, what will happen if you don't equipartition both tables.
scott@10GR2> create table customers
2 (
3 CUST_ID ,
4 CUST_FIRST_NAME ,
5 CUST_LAST_NAME ,
6 CUST_GENDER ,
7 CUST_DATE ,
8 CUST_POSTAL_CODE ,
9 CUST_CITY ,
10 CUST_TOTAL
11 )
12 partition by hash(cust_id) partitions 8
13 nologging
14 as
15 select CUST_ID ,
16 CUST_FIRST_NAME ,
17 CUST_LAST_NAME ,
18 CUST_GENDER ,
19 CUST_DATE ,
20 CUST_POSTAL_CODE ,
21 CUST_CITY ,
22 CUST_TOTAL
23 from customers_data
24 where 1=0;
Table created.
Elapsed: 00:00:00.25
scott@10GR2>
scott@10GR2>
scott@10GR2> create table sales
2 (
3 SALES_ID ,
4 CUST_ID ,
5 QUANTITY_SOLD ,
6 AMOUNT_SOLD ,
7 CUST_DATE
8 )
9 partition by hash(SALES_ID) partitions 8
10 nologging
11 as
12 select SALES_ID ,
13 CUST_ID ,
14 QUANTITY_SOLD ,
15 AMOUNT_SOLD ,
16 CUST_DATE
17 from sales_data
18 where 1=0;
Table created.
Elapsed: 00:00:00.21
scott@10GR2>
Now, Customer table is HASH partitioned on customer_id and Sales table is HASH partitioned on sales_id columns and now the Plan changes like below.
scott@10GR2>
scott@10GR2> begin
2 dbms_stats.set_table_stats(
3 ownname=>user,
4 tabname=>'CUSTOMERS',
5 numrows => 1026072,
6 numblks => 1000000 );
7
8 dbms_stats.set_table_stats(
9 ownname=>user,
10 tabname=>'SALES',
11 numrows => 11863049,
12 numblks => 1000000 );
13 end;
14 /
PL/SQL procedure successfully completed.
Elapsed: 00:00:00.10
scott@10GR2>
scott@10GR2> set autotrace traceonly explain;
scott@10GR2>
scott@10GR2> SELECT c.* , s.*
2 from customers c, sales s
3 where c.cust_id = s.cust_id
4 and c.cust_date between to_date('01-jul-2005','dd-mon-yyyy')
5 and to_date('30-sep-2005','dd-mon-yyyy');
Elapsed: 00:00:00.01
Execution Plan
----------------------------------------------------------
Plan hash value: 742433711
------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Cost (%CPU)| Pstart| Pstop |
------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 29658 | 438K (1)| | |
|* 1 | HASH JOIN | | 29658 | 438K (1)| | |
| 2 | PARTITION HASH ALL| | 2565 | 219K (1)| 1 | 8 |
|* 3 | TABLE ACCESS FULL| CUSTOMERS | 2565 | 219K (1)| 1 | 8 |
| 4 | PARTITION HASH ALL| | 11M| 219K (1)| 1 | 8 |
| 5 | TABLE ACCESS FULL| SALES | 11M| 219K (1)| 1 | 8 |
------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - access("C"."CUST_ID"="S"."CUST_ID")
3 - filter("C"."CUST_DATE">=TO_DATE('2005-07-01 00:00:00', 'yyyy-mm-dd hh24:mi:ss')
AND "C"."CUST_DATE"<=TO_DATE('2005-09-30 00:00:00', 'yyyy-mm-dd hh24:mi:ss'))
scott@10GR2>
You can see the Full partitioned wise join doesn't happen now ( PARTITION HASH ALL access path missing above HASH JOIN in plans ).
These kinds of large join is typical in data warehousing environments, when data-model is designed properly Full Partition Wise Joins will really be help full.
No comments:
Post a Comment