Showing posts with label Grouping Sets. Show all posts
Showing posts with label Grouping Sets. Show all posts

Thursday, November 18, 2010

Grouping Sets

Learnt something newly today, its about specifying the set of groups that you want to create using Grouping sets.

rajesh@10GR2> select x,y,z
  2  from t1
  3  order by 1;

         X          Y          Z
---------- ---------- ----------
         1          2          3
         2          2         10
         3          5          5

Elapsed: 00:00:00.01
rajesh@10GR2>
rajesh@10GR2> select sum(x),sum(y),sum(z)
  2  from t1
  3  order by 1;

    SUM(X)     SUM(Y)     SUM(Z)
---------- ---------- ----------
         6          9         18

Elapsed: 00:00:00.01

So one way to get the aggregated data's along with all the actual values is.

rajesh@10GR2>
rajesh@10GR2> select x,y,z
  2  from t1
  3  union all
  4  select sum(x),sum(y),sum(z)
  5  from t1
  6  order by 1
  7  /

         X          Y          Z
---------- ---------- ----------
         1          2          3
         2          2         10
         3          5          5
         6          9         18

Elapsed: 00:00:00.29
rajesh@10GR2>

The other is using super-aggregate function using CUBE and filtering out the needed aggregates like below.

rajesh@10GR2> select sum(x),sum(y),sum(z)
  2  from t1
  3  group by cube(z,y,x)
  4  having grouping(x)+grouping(z)+grouping(y) = 0 or
  5         grouping(x)+grouping(z)+grouping(y) = 3
  6  order by 1
  7  /

    SUM(X)     SUM(Y)     SUM(Z)
---------- ---------- ----------
         1          2          3
         2          2         10
         3          5          5
         6          9         18

Elapsed: 00:00:00.00
rajesh@10GR2>

But the coolest way is to use Groping sets and specify only the needed set of groups.

rajesh@10GR2> select sum(x),sum(y),sum(z)
  2  from t1
  3  group by grouping sets( (x,y,z),() )
  4  order by 1
  5  /

    SUM(X)     SUM(Y)     SUM(Z)
---------- ---------- ----------
         1          2          3
         2          2         10
         3          5          5
         6          9         18

Elapsed: 00:00:00.01
rajesh@10GR2>

Now lets benchmark this three stuffs on high data volume sets.

rajesh@10GR2> insert into t1(x,y,z)
  2  select mod(level,10),level,level
  3  from dual
  4  connect by level < = 1000000
  5  /

1000000 rows created.

Elapsed: 00:00:03.39
rajesh@10GR2> commit;

Commit complete.

Elapsed: 00:00:00.23
rajesh@10GR2>
rajesh@10GR2>
rajesh@10GR2> exec dbms_stats.gather_table_stats(user,'T1');

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.82
rajesh@10GR2>
rajesh@10GR2> set autotrace traceonly explain statistics;
rajesh@10GR2>
rajesh@10GR2> select x,y,z
  2  from t1
  3  union all
  4  select sum(x),sum(y),sum(z)
  5  from t1
  6  order by 1
  7  /

1000001 rows selected.

Elapsed: 00:00:28.71

Execution Plan
----------------------------------------------------------
Plan hash value: 1154063651

-------------------------------------------------------------------------------------
| Id  | Operation            | Name | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |      |  1016K|    37M|       | 11365   (1)| 00:02:17 |
|   1 |  SORT ORDER BY       |      |  1016K|    37M|   101M|  1113  (52)| 00:00:14 |
|   2 |   UNION-ALL          |      |       |       |       |            |          |
|   3 |    TABLE ACCESS FULL | T1   |  1016K|    37M|       |   557   (3)| 00:00:07 |
|   4 |    SORT AGGREGATE    |      |     1 |    39 |       |            |          |
|   5 |     TABLE ACCESS FULL| T1   |  1016K|    37M|       |   557   (3)| 00:00:07 |
-------------------------------------------------------------------------------------

Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
       4966  consistent gets
          0  physical reads
          0  redo size
   15806765  bytes sent via SQL*Net to client
      73722  bytes received via SQL*Net from client
       6668  SQL*Net roundtrips to/from client
          1  sorts (memory)
          0  sorts (disk)
    1000001  rows processed

rajesh@10GR2>
rajesh@10GR2> select sum(x),sum(y),sum(z)
  2  from t1
  3  group by cube(z,y,x)
  4  having grouping(x)+grouping(z)+grouping(y) = 0 or
  5         grouping(x)+grouping(z)+grouping(y) = 3
  6  order by 1
  7  /

1000001 rows selected.

Elapsed: 00:01:46.45

Execution Plan
----------------------------------------------------------
Plan hash value: 3795960770

-------------------------------------------------------------------------------
| Id  | Operation              | Name | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |      |  1016K|    37M|   695  (23)| 00:00:09 |
|   1 |  SORT ORDER BY         |      |  1016K|    37M|   695  (23)| 00:00:09 |
|*  2 |   FILTER               |      |       |       |            |          |
|   3 |    SORT GROUP BY       |      |  1016K|    37M|   695  (23)| 00:00:09 |
|   4 |     GENERATE CUBE      |      |  1016K|    37M|   695  (23)| 00:00:09 |
|   5 |      SORT GROUP BY     |      |  1016K|    37M|   695  (23)| 00:00:09 |
|   6 |       TABLE ACCESS FULL| T1   |  1016K|    37M|   557   (3)| 00:00:07 |
-------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter(GROUPING("X")+GROUPING("Z")+GROUPING("Y")=0 OR
              GROUPING("X")+GROUPING("Z")+GROUPING("Y")=3)


Statistics
----------------------------------------------------------
        315  recursive calls
         39  db block gets
       2483  consistent gets
      40208  physical reads
          0  redo size
   15806780  bytes sent via SQL*Net to client
      73722  bytes received via SQL*Net from client
       6668  SQL*Net roundtrips to/from client
          1  sorts (memory)
          2  sorts (disk)
    1000001  rows processed

rajesh@10GR2>
rajesh@10GR2>
rajesh@10GR2> select sum(x),sum(y),sum(z)
  2  from t1
  3  group by grouping sets( (x,y,z),() )
  4  order by 1
  5  /

1000001 rows selected.

Elapsed: 00:00:40.87

Execution Plan
----------------------------------------------------------
Plan hash value: 2567933839

------------------------------------------------------------------------------
| Id  | Operation             | Name | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |      |  1016K|    37M|   695  (23)| 00:00:09 |
|   1 |  SORT ORDER BY        |      |  1016K|    37M|   695  (23)| 00:00:09 |
|   2 |   SORT GROUP BY ROLLUP|      |  1016K|    37M|   695  (23)| 00:00:09 |
|   3 |    TABLE ACCESS FULL  | T1   |  1016K|    37M|   557   (3)| 00:00:07 |
------------------------------------------------------------------------------

Statistics
----------------------------------------------------------
         43  recursive calls
          7  db block gets
       2483  consistent gets
       5430  physical reads
          0  redo size
   15806780  bytes sent via SQL*Net to client
      73722  bytes received via SQL*Net from client
       6668  SQL*Net roundtrips to/from client
          1  sorts (memory)
          1  sorts (disk)
    1000001  rows processed

See, Grouping set performs data aggregation upon selecting table 'T' and answers it smartly.