Wednesday, June 26, 2019

Exadata Smart Scan - Part III

 
Exadata smart scan is not only limited to predicate filtering and column projection, but also possible with joins – that is joins can be offloaded to storage tier as well. Offloaded joins are accomplished by creating what is called bloom filters. Bloom filters have been around for some long time since Oracle version 10g release 2, so they are not specific to Exadata. One of the main reason oracle uses them is to reduce the traffic between parallel query slaves. They have the advantage of being very small relative to the data set that they represent. However this comes with the price – they can return false positives. That is, the rows that should not be included in the desired results set can occasionally pass a bloom filter. For that reason additional filter must be applied after the bloom filter to ensure that any false positives are eliminated.
 
The interesting thing about the bloom filters from an exadata perspective is that they may be passed to storage servers and evaluated there. This technique can result in a large decrease in the volume of data that must be transmitted back to the database servers.
 
Here is an example:
 
NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
cell_offload_processing              boolean     TRUE
 
PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------
SQL_ID  d53sar94vmh6k, child number 0
-------------------------------------
select /*+ parallel(b1,4) parallel(b2,4) */ b1.object_type,
count(distinct b2.owner) from big_table b1, big_table b2 where b1.id =
b2.id and b1.object_id = b2.object_id and b2.object_type like 'TABLE%'
group by b1.object_type
 
Plan hash value: 564293460
 
------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                                 | Name      | Starts | E-Rows | Cost (%CPU)|    TQ  |IN-OUT| A-Rows |   A-Time   |
------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                          |           |      1 |        |   306K(100)|        |      |      3 |00:00:04.35 |
|   1 |  PX COORDINATOR                           |           |      1 |        |            |        |      |      3 |00:00:04.35 |
|   2 |   PX SEND QC (RANDOM)                     | :TQ10003  |      0 |     34 |   306K  (1)|  Q1,03 | P->S |      0 |00:00:00.01 |
|   3 |    HASH GROUP BY                          |           |      4 |     34 |   306K  (1)|  Q1,03 | PCWP |      3 |00:00:00.01 |
|   4 |     PX RECEIVE                            |           |      4 |     34 |   306K  (1)|  Q1,03 | PCWP |      8 |00:00:00.01 |
|   5 |      PX SEND HASH                         | :TQ10002  |      0 |     34 |   306K  (1)|  Q1,02 | P->P |      0 |00:00:00.01 |
|   6 |       HASH GROUP BY                       |           |      4 |     34 |   306K  (1)|  Q1,02 | PCWP |      8 |00:00:00.02 |
|   7 |        VIEW                               | VM_NWVW_1 |      4 |    746 |   306K  (1)|  Q1,02 | PCWP |     30 |00:00:00.02 |
|   8 |         HASH GROUP BY                     |           |      4 |    746 |   306K  (1)|  Q1,02 | PCWP |     30 |00:00:00.02 |
|   9 |          PX RECEIVE                       |           |      4 |    746 |   306K  (1)|  Q1,02 | PCWP |    120 |00:00:00.02 |
|  10 |           PX SEND HASH                    | :TQ10001  |      0 |    746 |   306K  (1)|  Q1,01 | P->P |      0 |00:00:00.01 |
|  11 |            HASH GROUP BY                  |           |      4 |    746 |   306K  (1)|  Q1,01 | PCWP |    120 |00:00:17.01 |
|* 12 |             HASH JOIN                     |           |      4 |   2941K|   306K  (1)|  Q1,01 | PCWP |   3875K|00:00:16.37 |
|  13 |              JOIN FILTER CREATE           | :BF0000   |      4 |   2941K|   153K  (1)|  Q1,01 | PCWP |     15M|00:00:02.69 |
|  14 |               PX RECEIVE                  |           |      4 |   2941K|   153K  (1)|  Q1,01 | PCWP |     15M|00:00:01.14 |
|  15 |                PX SEND BROADCAST          | :TQ10000  |      0 |   2941K|   153K  (1)|  Q1,00 | P->P |      0 |00:00:00.01 |
|  16 |                 PX BLOCK ITERATOR         |           |      4 |   2941K|   153K  (1)|  Q1,00 | PCWC |   3875K|00:00:00.84 |
|* 17 |                  TABLE ACCESS STORAGE FULL| BIG_TABLE |     53 |   2941K|   153K  (1)|  Q1,00 | PCWP |   3875K|00:00:00.81 |
|  18 |              JOIN FILTER USE              | :BF0000   |      4 |    100M|   153K  (1)|  Q1,01 | PCWP |    100M|00:00:02.77 |
|  19 |               PX BLOCK ITERATOR           |           |      4 |    100M|   153K  (1)|  Q1,01 | PCWC |    100M|00:00:02.70 |
|* 20 |                TABLE ACCESS STORAGE FULL  | BIG_TABLE |     53 |    100M|   153K  (1)|  Q1,01 | PCWP |    100M|00:00:02.62 |
------------------------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
 
  12 - access("B1"."ID"="B2"."ID" AND "B1"."OBJECT_ID"="B2"."OBJECT_ID")
  17 - storage(:Z>=:Z AND :Z<=:Z AND "B2"."OBJECT_TYPE" LIKE 'TABLE%')
       filter("B2"."OBJECT_TYPE" LIKE 'TABLE%')
  20 - storage(:Z>=:Z AND :Z<=:Z AND SYS_OP_BLOOM_FILTER(:BF0000,"B1"."ID","B1"."OBJECT_ID"))
       filter(SYS_OP_BLOOM_FILTER(:BF0000,"B1"."ID","B1"."OBJECT_ID"))
 
 
 
NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
cell_offload_processing              boolean     FALSE
 
PLAN_TABLE_OUTPUT
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID  d53sar94vmh6k, child number 1
-------------------------------------
select /*+ parallel(b1,4) parallel(b2,4) */ b1.object_type,
count(distinct b2.owner) from big_table b1, big_table b2 where b1.id =
b2.id and b1.object_id = b2.object_id and b2.object_type like 'TABLE%'
group by b1.object_type
 
Plan hash value: 564293460
 
------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                                 | Name      | Starts | E-Rows | Cost (%CPU)|    TQ  |IN-OUT| A-Rows |   A-Time   |
------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                          |           |      1 |        |   306K(100)|        |      |      3 |00:00:18.10 |
|   1 |  PX COORDINATOR                           |           |      1 |        |            |        |      |      3 |00:00:18.10 |
|   2 |   PX SEND QC (RANDOM)                     | :TQ10003  |      0 |     34 |   306K  (1)|  Q1,03 | P->S |      0 |00:00:00.01 |
|   3 |    HASH GROUP BY                          |           |      4 |     34 |   306K  (1)|  Q1,03 | PCWP |      3 |00:00:00.01 |
|   4 |     PX RECEIVE                            |           |      4 |     34 |   306K  (1)|  Q1,03 | PCWP |      8 |00:00:00.01 |
|   5 |      PX SEND HASH                         | :TQ10002  |      0 |     34 |   306K  (1)|  Q1,02 | P->P |      0 |00:00:00.01 |
|   6 |       HASH GROUP BY                       |           |      4 |     34 |   306K  (1)|  Q1,02 | PCWP |      8 |00:00:00.01 |
|   7 |        VIEW                               | VM_NWVW_1 |      4 |    746 |   306K  (1)|  Q1,02 | PCWP |     30 |00:00:00.01 |
|   8 |         HASH GROUP BY                     |           |      4 |    746 |   306K  (1)|  Q1,02 | PCWP |     30 |00:00:00.01 |
|   9 |          PX RECEIVE                       |           |      4 |    746 |   306K  (1)|  Q1,02 | PCWP |    120 |00:00:00.01 |
|  10 |           PX SEND HASH                    | :TQ10001  |      0 |    746 |   306K  (1)|  Q1,01 | P->P |      0 |00:00:00.01 |
|  11 |            HASH GROUP BY                  |           |      4 |    746 |   306K  (1)|  Q1,01 | PCWP |    120 |00:01:12.05 |
|* 12 |             HASH JOIN                     |           |      4 |   2941K|   306K  (1)|  Q1,01 | PCWP |   3875K|00:01:11.44 |
|  13 |              JOIN FILTER CREATE           | :BF0000   |      4 |   2941K|   153K  (1)|  Q1,01 | PCWP |     15M|00:00:28.17 |
|  14 |               PX RECEIVE                  |           |      4 |   2941K|   153K  (1)|  Q1,01 | PCWP |     15M|00:00:26.31 |
|  15 |                PX SEND BROADCAST          | :TQ10000  |      0 |   2941K|   153K  (1)|  Q1,00 | P->P |      0 |00:00:00.01 |
|  16 |                 PX BLOCK ITERATOR         |           |      4 |   2941K|   153K  (1)|  Q1,00 | PCWC |   3875K|00:00:29.16 |
|* 17 |                  TABLE ACCESS STORAGE FULL| BIG_TABLE |     53 |   2941K|   153K  (1)|  Q1,00 | PCWP |   3875K|00:00:29.04 |
|  18 |              JOIN FILTER USE              | :BF0000   |      4 |    100M|   153K  (1)|  Q1,01 | PCWP |   6614K|00:00:35.04 |
|  19 |               PX BLOCK ITERATOR           |           |      4 |    100M|   153K  (1)|  Q1,01 | PCWC |   6614K|00:00:34.79 |
|* 20 |                TABLE ACCESS STORAGE FULL  | BIG_TABLE |     53 |    100M|   153K  (1)|  Q1,01 | PCWP |   6614K|00:00:34.51 |
------------------------------------------------------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
  12 - access("B1"."ID"="B2"."ID" AND "B1"."OBJECT_ID"="B2"."OBJECT_ID")
  17 - storage(:Z>=:Z AND :Z<=:Z)
       filter("B2"."OBJECT_TYPE" LIKE 'TABLE%')
  20 - storage(:Z>=:Z AND :Z<=:Z)
       filter(SYS_OP_BLOOM_FILTER(:BF0000,"B1"."ID","B1"."OBJECT_ID"))
 
 
Notice that our test query with offloading took around 4sec whereas without the offloading took around 18sec. if you look closely at the predicate information in the plans you can observe the bloom filter predicate - SYS_OP_BLOOM_FILTER(:BF0000,"B1"."ID","B1"."OBJECT_ID") – was run on the storage server for the first execution. The offloaded version run faster because the storage servers were able to pre-join the tables, which eliminated a large amount of data that would otherwise have been transferred back to database servers.
 
c##rajesh@PDB1> select child_number,
  2      round(s.io_cell_offload_eligible_bytes/1024/1024,2) as eligible_mb,
  3      round(s.io_interconnect_bytes/1024/1024,2) actual_mb
  4  from v$sql s
  5  where sql_id ='d53sar94vmh6k'
  6  order by 1 ;
 
CHILD_NUMBER ELIGIBLE_MB  ACTUAL_MB
------------ ----------- ----------
           0    31633.63    1215.78
           1           0   31633.63
 
 
With smart scan we just transferred around 1GB of data from storage server to database server. Whereas without smart scan we transferred around 32GB of data.
 

No comments:

Post a Comment