Smart scan is the secret sauce of Exadata. It’s what makes Exadata
different from every other platform that Oracle runs on. It refers to the
concept of moving the processing from the database server to storage layer. It
is also a key paradigm shift possible only in Exadata platform.
One of the major problem of large database is the time it takes to
transfer the large volume of data (for adhoc queries) between storage and
database server. Smart scan is an approach that used to solve the problem of
excessive time spent in moving relevant data between tiers.
The idea of this posting is to show that there is more behind the
Exadata Database Machine than sophisticated and well-combined Hardware, but
there is actually some kind of (Database) Intelligence built into the Storage
Layer that enables it to do column & predicate filtering before it sends
results to the Database Layer. This functionality is called Smart Scan:
Create a table having 100M rows that sized around 16GB.
c##rajesh@PDB1> exec show_space('BIG_TABLE');
Unformatted Blocks .....................
0
FS1 Blocks (0-25) ......................
0
FS2 Blocks (25-50) .....................
0
FS3 Blocks (50-75) .....................
0
FS4 Blocks (75-100).....................
42
Full Blocks ............................
2,024,510
Total Blocks............................
2,029,448
Total Bytes.............................
16,625,238,016
Total MBytes............................
15,855
Unused Blocks...........................
0
Unused Bytes............................
0
Last Used Ext FileId....................
24
Last Used Ext BlockId...................
2,023,424
Last Used Block.........................
4,824
PL/SQL procedure successfully completed.
c##rajesh@PDB1> select
to_char(count(*),'fm999,999,999') cnt from big_table;
CNT
------------
100,000,000
The playing field is setup. We will first look at the scan speed
without smart scan.
c##rajesh@PDB1> alter session set
statistics_level=all;
Session altered.
c##rajesh@PDB1> set serveroutput off
c##rajesh@PDB1> show parameter
cell_offload_processing
NAME TYPE VALUE
------------------------------------ -----------
------------------------------
cell_offload_processing boolean TRUE
c##rajesh@PDB1> alter session set
cell_offload_processing=false;
Session altered.
c##rajesh@PDB1> select count(*) from big_table
where owner ='JYU';
COUNT(*)
----------
7384
c##rajesh@PDB1> select * from table(
dbms_xplan.display_cursor(format=>'allstats last'));
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID
bkjgwg8jwctqj, child number 0
-------------------------------------
select count(*) from big_table where owner
='JYU'
Plan hash value: 599409829
-----------------------------------------------------------------------------------------------------------
| Id |
Operation | Name | Starts | E-Rows | A-Rows | A-Time
| Buffers | Reads |
-----------------------------------------------------------------------------------------------------------
| 0 |
SELECT STATEMENT | | 1 |
| 1 |00:00:19.45 | 2024K|
2024K|
| 1
| SORT AGGREGATE | | 1 |
1 | 1 |00:00:19.45 | 2024K|
2024K|
|* 2
| TABLE ACCESS STORAGE FULL| BIG_TABLE
| 1 | 7384 |
7384 |00:00:19.45 | 2024K| 2024K|
-----------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation
id):
---------------------------------------------------
2 -
filter("OWNER"='JYU')
19 rows selected.
We needed around 20 sec to scan about 16GB, with the storage servers
delivering the full amount of data into the database layer.
Now the very similar statement with smart scan functionality.
c##rajesh@PDB1> alter session set
cell_offload_processing=true;
Session altered.
c##rajesh@PDB1> select count(*) from
big_table where owner ='JYU';
COUNT(*)
----------
7384
c##rajesh@PDB1> select * from table(
dbms_xplan.display_cursor(format=>'allstats last'));
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID bkjgwg8jwctqj,
child number 1
-------------------------------------
select count(*) from big_table where owner
='JYU'
Plan hash value: 599409829
--------------------------------------------------------------------------------------------------------------------------------------
| Id |
Operation | Name | Starts | E-Rows | A-Rows | A-Time
| Buffers | Reads | OMem |
1Mem | Used-Mem |
--------------------------------------------------------------------------------------------------------------------------------------
| 0 |
SELECT STATEMENT | | 1 |
| 1 |00:00:00.20 | 2024K|
2024K| | | |
| 1
| SORT AGGREGATE | | 1 |
1 | 1 |00:00:00.20 | 2024K|
2024K| | | |
|* 2
| TABLE ACCESS STORAGE FULL| BIG_TABLE
| 1 | 7384 |
7384 |00:00:00.20 | 2024K| 2024K|
1025K| 1025K| 6170K (0)|
--------------------------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation
id):
---------------------------------------------------
2 - storage("OWNER"='JYU')
filter("OWNER"='JYU')
20 rows selected.
c##rajesh@PDB1>
Runtime was reduced to less than a second. The presence of “storage” in
the predicate section confirms that smart scan – the ability to filter the
columns and predicates at storage layer before sending the results to database
layer - optimization kick in.
No comments:
Post a Comment