Tuesday, October 26, 2010

Sorted Hash Clusters

Sorted hash clusters are new data structures available from Oracle 10g Release 1 that enable faster retrieval of data for applications where data is consumed in the order in which it was inserted " using sorted hash clusters will skip the sorting step on your data when the order by in your query matches the sort in the hash cluster ".

rajesh@10GR2> create cluster t_clust(
  2     x number,
  3     y number sort,
  4     z number sort
  5  )HASHKEYS 100 HASH is x;

Cluster created.

Elapsed: 00:00:00.53
rajesh@10GR2>
rajesh@10GR2> create table t (
  2     c1 number,
  3     c2 number sort,
  4     c3 number sort,
  5     dt date,
  6     usr varchar2(10)
  7  )cluster t_clust(c1,c2,c3);

Table created.

Elapsed: 00:00:00.10
rajesh@10GR2>
rajesh@10GR2> insert /*+ append */ into t
  2  select     mod(rownum,100),
  3             rownum ,
  4             rownum,
  5             sysdate,
  6             user
  7  from dual
  8  connect by level <= 500000;

500000 rows created.

Elapsed: 00:00:49.54
rajesh@10GR2> commit;

Commit complete.

Elapsed: 00:00:00.06
rajesh@10GR2>

created a Sorted hash cluster and inserted data. Now, i will create a Heap organized table with data available cluster table. 

rajesh@10GR2> create table t_heap as select * from t;

Table created.

Elapsed: 00:00:07.29
rajesh@10GR2> begin
  2     dbms_stats.gather_table_stats(user,'T');
  3     dbms_stats.gather_table_stats(user,'T_HEAP');
  4  end;
  5  /

PL/SQL procedure successfully completed.

Elapsed: 00:00:08.43
rajesh@10GR2>

Now, looking at the Tkprof it show the below.

********************************************************************************
select *
from t
where c1 = 57
order by c2,c3

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          2          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch       35      0.03       0.00          0         62          0        5000
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total       37      0.03       0.01          0         64          0        5000

Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 100 

Rows     Row Source Operation
-------  ---------------------------------------------------
   5000  TABLE ACCESS HASH T (cr=62 pr=0 pw=0 time=5056 us)
********************************************************************************
select *
from t_heap
where c1 = 57
order by c2,c3

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch       35      0.03       0.04          0       2296          0        5000
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total       37      0.03       0.04          0       2296          0        5000

Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 100 

Rows     Row Source Operation
-------  ---------------------------------------------------
   5000  SORT ORDER BY (cr=2296 pr=0 pw=0 time=38697 us)
   5000   TABLE ACCESS FULL T_HEAP (cr=2296 pr=0 pw=0 time=6498 us)
********************************************************************************

look at the elimination of " Sort Order by " Access path  in Explain plan for query using Sorted hash clusters. so using sorted hash clusters will skip the sorting step on your data when the order by in your query matches the sort in the hash cluster.

No comments:

Post a Comment