Tuesday, January 12, 2016

Amazing Optimization to get distinct and TopN values from index - Part II

This will be extension of previous post, in case if you have not gone through that, please read that before to proceed here.
Say we need to get TopN entries for each owner. That is for each owner, I need to get the top-five maximum object_id’s along with its object_name.
 
The standard query using Analytics would be this way.
 
rajesh@ORA11G> select owner,object_id,object_name
  2  from (
  3  select object_id,owner,object_name,
  4    row_number() over(partition by owner order by object_id desc) rn
  5  from t
  6       )
  7  where rn <= 5
  8  order by owner,object_id ;
 
148 rows selected.
 
 
Execution Plan
----------------------------------------------------------
Plan hash value: 3060612387
 
-------------------------------------------------------------------------------
| Id  | Operation                 | Name | Rows  | Bytes |TempSpc| Cost (%CPU)|
-------------------------------------------------------------------------------
|   0 | SELECT STATEMENT          |      | 84718 |  4963K|       |  3185   (3)|
|   1 |  SORT ORDER BY            |      | 84718 |  4963K|  6008K|  3185   (3)|
|*  2 |   VIEW                    |      | 84718 |  4963K|       |  1536   (3)|
|*  3 |    WINDOW SORT PUSHED RANK|      | 84718 |  2316K|  3336K|  1536   (3)|
|   4 |     TABLE ACCESS FULL     | T    | 84718 |  2316K|       |   632   (2)|
-------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   2 - filter("RN"<=5)
   3 - filter(ROW_NUMBER() OVER ( PARTITION BY "OWNER" ORDER BY
              INTERNAL_FUNCTION("OBJECT_ID") DESC )<=5)
 
 
Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
       1214  consistent gets
          0  physical reads
          0  redo size
       4696  bytes sent via SQL*Net to client
        500  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          2  sorts (memory)
          0  sorts (disk)
        148  rows processed
 
rajesh@ORA11G> set autotrace off
 
The actual plan from library cache would be this
 
rajesh@ORA11G> select * from table(dbms_xplan.display_cursor(format=>'allstats last'));
 
PLAN_TABLE_OUTPUT
-----------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID  ca0u3qvm5yujf, child number 1
-------------------------------------
select owner,object_id,object_name from ( select
object_id,owner,object_name,   row_number() over(partition by owner
order by object_id desc) rn from t      ) where rn <= 5 order by
owner,object_id
 
Plan hash value: 3060612387
 
--------------------------------------------------------------------------------------------
| Id  | Operation                 | Name | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
--------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT          |      |      1 |        |    148 |00:00:00.15 |    1214 |
|   1 |  SORT ORDER BY            |      |      1 |  84718 |    148 |00:00:00.15 |    1214 |
|*  2 |   VIEW                    |      |      1 |  84718 |    148 |00:00:00.15 |    1214 |
|*  3 |    WINDOW SORT PUSHED RANK|      |      1 |  84718 |    176 |00:00:00.15 |    1214 |
|   4 |     TABLE ACCESS FULL     | T    |      1 |  84718 |  84718 |00:00:00.02 |    1214 |
--------------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   2 - filter("RN"<=5)
   3 - filter(ROW_NUMBER() OVER ( PARTITION BY "OWNER" ORDER BY INTERNAL_FUNCTION("OBJECT_ID") DESC )<=5)
 
 
25 rows selected.
 
rajesh@ORA11G>
 
Here is the explanation of the process
 
  • Full scan the table “T”
  • Logically partition them into subsets based on the values from OWNER column.
  • Within each partition, sort the rows based on the object_id in Descending order, till we find five values per partition.
  • Finally sort the result based on “order by” clause of sql statement, to produce the result set.
 
 
With the new modified approach it would be like this.
 
rajesh@ORA11G> with datas(x) as
  2  (
  3     select min(owner) from t
  4     union all
  5     select (select min(owner) from t t2
  6                             where t2.owner > datas.x )
  7     from datas
  8     where x is not null )
  9  select x, to_number(substr(column_value,1,6)) as object_id,
 10        substr(column_value,7) as object_name
 11  from datas d1,
 12     table(cast( multiset( select /*+ index_desc(t,t_idx) */
 13          to_char(object_id,'fm000000')||object_name txt
 14     from t
 15     where t.owner = d1.x
 16     and rownum <=5 ) as sys.odcivarchar2list))
 17  where x is not null
 18  order by x,object_id ;
 
148 rows selected.
 
 
Execution Plan
----------------------------------------------------------
Plan hash value: 254218082
 
---------------------------------------------------------------------------------
| Id  | Operation                                   | Name  | Rows  |Cost (%CPU)|
---------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                            |       | 16336 |  240   (3)|
|   1 |  SORT ORDER BY                              |       | 16336 |  240   (3)|
|   2 |   NESTED LOOPS                              |       | 16336 |  107   (2)|
|*  3 |    VIEW                                     |       |     2 |    4   (0)|
|   4 |     UNION ALL (RECURSIVE WITH) BREADTH FIRST|       |       |           |
|   5 |      SORT AGGREGATE                         |       |     1 |           |
|   6 |       INDEX FULL SCAN (MIN/MAX)             | T_IDX |     1 |    2   (0)|
|   7 |      SORT AGGREGATE                         |       |     1 |           |
|   8 |       FIRST ROW                             |       |     1 |    2   (0)|
|*  9 |        INDEX RANGE SCAN (MIN/MAX)           | T_IDX |     1 |    2   (0)|
|* 10 |      RECURSIVE WITH PUMP                    |       |       |           |
|  11 |    COLLECTION ITERATOR SUBQUERY FETCH       |       |  8168 |   51   (0)|
|* 12 |     COUNT STOPKEY                           |       |       |           |
|  13 |      TABLE ACCESS BY INDEX ROWID            | T     |  2733 |   87   (2)|
|* 14 |       INDEX RANGE SCAN DESCENDING           | T_IDX |  2733 |   10   (0)|
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   3 - filter("X" IS NOT NULL)
   9 - access("T2"."OWNER">:B1)
  10 - filter("X" IS NOT NULL)
  12 - filter(ROWNUM<=5)
  14 - access("T"."OWNER"=:B1)
 
 
Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
        142  consistent gets
          0  physical reads
          0  redo size
       4692  bytes sent via SQL*Net to client
        500  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
         34  sorts (memory)
          0  sorts (disk)
        148  rows processed
 
rajesh@ORA11G> set autotrace off
rajesh@ORA11G> select * from table(dbms_xplan.display_cursor(format=>'allstats last'));
 
PLAN_TABLE_OUTPUT
-----------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID  7axgncuqtg1r2, child number 0
-------------------------------------
with datas(x) as (  select min(owner) from t  union all  select (select
min(owner) from t t2     where t2.owner > datas.x )  from datas  where
x is not null ) select x, to_number(substr(column_value,1,6)) as
object_id,       substr(column_value,7) as object_name from datas d1,
 table(cast( multiset( select /*+ index_desc(t,t_idx) */
to_char(object_id,'fm000000')||object_name txt    from t    where
t.owner = d1.x    and rownum <=5 ) as sys.odcivarchar2list)) where x is
not null order by x,object_id
 
Plan hash value: 254218082
 
-------------------------------------------------------------------------------------------------
| Id  | Operation                                   | Name  | Starts | E-Rows | A-Rows |Buffers |
-------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                            |       |      1 |        |    148 |    142 |
|   1 |  SORT ORDER BY                              |       |      1 |  16336 |    148 |    142 |
|   2 |   NESTED LOOPS                              |       |      1 |  16336 |    148 |    142 |
|*  3 |    VIEW                                     |       |      1 |      2 |     31 |     35 |
|   4 |     UNION ALL (RECURSIVE WITH) BREADTH FIRST|       |      1 |        |     32 |     35 |
|   5 |      SORT AGGREGATE                         |       |      1 |      1 |      1 |      2 |
|   6 |       INDEX FULL SCAN (MIN/MAX)             | T_IDX |      1 |      1 |      1 |      2 |
|   7 |      SORT AGGREGATE                         |       |     31 |      1 |     31 |     33 |
|   8 |       FIRST ROW                             |       |     31 |      1 |     30 |     33 |
|*  9 |        INDEX RANGE SCAN (MIN/MAX)           | T_IDX |     31 |      1 |     30 |     33 |
|  10 |      RECURSIVE WITH PUMP                    |       |     32 |        |     31 |      0 |
|  11 |    COLLECTION ITERATOR SUBQUERY FETCH       |       |     31 |   8168 |    148 |    107 |
|* 12 |     COUNT STOPKEY                           |       |     31 |        |    148 |    107 |
|  13 |      TABLE ACCESS BY INDEX ROWID            | T     |     31 |   2733 |    148 |    107 |
|* 14 |       INDEX RANGE SCAN DESCENDING           | T_IDX |     31 |   2733 |    148 |     63 |
-------------------------------------------------------------------------------------------------   
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   3 - filter("X" IS NOT NULL)
   9 - access("T2"."OWNER">:B1)
  12 - filter(ROWNUM<=5)
  14 - access("T"."OWNER"=:B1)
 
 
41 rows selected.
 
rajesh@ORA11G>
 
The recursive “With” clause looks very promising compared to Analytics approach.
 
But looking into deep one might think that below portion of code “not” returns the Top five maximum object_ids, instead it returns the random five records per owner.
 
  12     table(cast( multiset( select /*+ index_desc(t,t_idx) */
 13          to_char(object_id,'fm000000')||object_name txt
 14     from t
 15     where t.owner = d1.x
 16     and rownum <=5 )
 
The query only selects values from the T_IDX index (owner,object_id and rowid). So Oracle can do an index range scan descending. Walking along the index like this will return the values in (descending) index order. Oracle will then stop when it has read 5 of these and then access the table using those rowid’s to get “object_name” values.
 
If Oracle chooses a FTS, then yes, the results will be random. To guard against this I have included the index_desc hint:
 
rajesh@ORA11G>
rajesh@ORA11G> select /*+ full(t) */ owner,object_id
  2  from t
  3  where owner ='SYS'
  4  and rownum <=5 ;
 
OWNER       OBJECT_ID
---------- ----------
SYS                20
SYS                46
SYS                28
SYS                15
SYS                29
 
5 rows selected.
 
rajesh@ORA11G> select /*+ index_desc(t,t_idx) */ owner,object_id
  2  from t
  3  where owner ='SYS'
  4  and rownum <=5 ;
 
OWNER       OBJECT_ID
---------- ----------
SYS             92922
SYS             92921
SYS             92920
SYS             92730
SYS             92729
 
5 rows selected.
 
rajesh@ORA11G>

Values from the sql using index_desc hint matches exactly with analytics output.
 
rajesh@ORA11G>
rajesh@ORA11G> select owner,object_id
  2  from (
  3  select owner,object_id,
  4    row_number() over(order by object_id desc) rn
  5  from t
  6  where owner ='SYS'
  7       )
  8  where rn <= 5
  9  /
 
OWNER       OBJECT_ID
---------- ----------
SYS             92922
SYS             92921
SYS             92920
SYS             92730
SYS             92729
 
5 rows selected.
 
rajesh@ORA11G>

No comments:

Post a Comment