Monday, June 10, 2019

Exadata Smart Scan - Part I

Smart scan is the secret sauce of Exadata. It’s what makes Exadata different from every other platform that Oracle runs on. It refers to the concept of moving the processing from the database server to storage layer. It is also a key paradigm shift possible only in Exadata platform.

One of the major problem of large database is the time it takes to transfer the large volume of data (for adhoc queries) between storage and database server. Smart scan is an approach that used to solve the problem of excessive time spent in moving relevant data between tiers.

The idea of this posting is to show that there is more behind the Exadata Database Machine than sophisticated and well-combined Hardware, but there is actually some kind of (Database) Intelligence built into the Storage Layer that enables it to do column & predicate filtering before it sends results to the Database Layer. This functionality is called Smart Scan:

Create a table having 100M rows that sized around 16GB.

c##rajesh@PDB1> exec show_space('BIG_TABLE');
Unformatted Blocks .....................               0
FS1 Blocks (0-25) ......................               0
FS2 Blocks (25-50) .....................               0
FS3 Blocks (50-75) .....................               0
FS4 Blocks (75-100).....................              42
Full Blocks ............................       2,024,510
Total Blocks............................       2,029,448
Total Bytes.............................  16,625,238,016
Total MBytes............................          15,855
Unused Blocks...........................               0
Unused Bytes............................               0
Last Used Ext FileId....................              24
Last Used Ext BlockId...................       2,023,424
Last Used Block.........................           4,824

PL/SQL procedure successfully completed.

c##rajesh@PDB1> select to_char(count(*),'fm999,999,999') cnt from big_table;

CNT
------------
100,000,000

The playing field is setup. We will first look at the scan speed without smart scan.

c##rajesh@PDB1> alter session set statistics_level=all;

Session altered.

c##rajesh@PDB1> set serveroutput off
c##rajesh@PDB1> show parameter cell_offload_processing

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
cell_offload_processing              boolean     TRUE
c##rajesh@PDB1> alter session set cell_offload_processing=false;

Session altered.

c##rajesh@PDB1> select count(*) from big_table where owner ='JYU';

  COUNT(*)
----------
      7384

c##rajesh@PDB1> select * from table( dbms_xplan.display_cursor(format=>'allstats last'));

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID  bkjgwg8jwctqj, child number 0
-------------------------------------
select count(*) from big_table where owner ='JYU'

Plan hash value: 599409829

-----------------------------------------------------------------------------------------------------------
| Id  | Operation                  | Name      | Starts | E-Rows | A-Rows |   A-Time   | Buffers | Reads  |
-----------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT           |           |      1 |        |      1 |00:00:19.45 |    2024K|   2024K|
|   1 |  SORT AGGREGATE            |           |      1 |      1 |      1 |00:00:19.45 |    2024K|   2024K|
|*  2 |   TABLE ACCESS STORAGE FULL| BIG_TABLE |      1 |   7384 |   7384 |00:00:19.45 |    2024K|   2024K|
-----------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("OWNER"='JYU')


19 rows selected.

We needed around 20 sec to scan about 16GB, with the storage servers delivering the full amount of data into the database layer.

Now the very similar statement with smart scan functionality.

c##rajesh@PDB1> alter session set cell_offload_processing=true;

Session altered.

c##rajesh@PDB1> select count(*) from big_table where owner ='JYU';

  COUNT(*)
----------
      7384

c##rajesh@PDB1> select * from table( dbms_xplan.display_cursor(format=>'allstats last'));

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID  bkjgwg8jwctqj, child number 1
-------------------------------------
select count(*) from big_table where owner ='JYU'

Plan hash value: 599409829

--------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                  | Name      | Starts | E-Rows | A-Rows |   A-Time   | Buffers | Reads  |  OMem |  1Mem | Used-Mem |
--------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT           |           |      1 |        |      1 |00:00:00.20 |    2024K|   2024K|       |       |          |
|   1 |  SORT AGGREGATE            |           |      1 |      1 |      1 |00:00:00.20 |    2024K|   2024K|       |       |          |
|*  2 |   TABLE ACCESS STORAGE FULL| BIG_TABLE |      1 |   7384 |   7384 |00:00:00.20 |    2024K|   2024K|  1025K|  1025K| 6170K (0)|
--------------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - storage("OWNER"='JYU')
       filter("OWNER"='JYU')


20 rows selected.

c##rajesh@PDB1>

Runtime was reduced to less than a second. The presence of “storage” in the predicate section confirms that smart scan – the ability to filter the columns and predicates at storage layer before sending the results to database layer - optimization kick in.

No comments:

Post a Comment