One of the nice capability of Oracle database is the ability
to reuse the cursor/plans across the multiple executions. However reusing the
plan for each execution will reduce the total (hard) parsing done in the
database that in-turn improve the scalability of the application.
However reusing the plan in the ETL or warehousing
environment can be dangerous, since the data volume returned for each execution
will be different. So the plan generated for the first execution will not be
the optimal plan for the subsequent execution.
Even the stats gathered before the subsequent execution
won’t help unless until no_invalidate
option is included in the stats gathering process.
Here is the demo, Created a table with index on the ID
column, then data loaded and gathered the stats on this table along with a
histogram on the ID column (since that column is massively skewed)
demo@ORA12C> create table t(id number,code
varchar2(60),some_other_text varchar2(60) );
Table created.
demo@ORA12C> create index t_idx on t(id);
Index created.
demo@ORA12C> insert into t(id,code,some_other_text)
2 select decode(rownum,1,1,99),
substr(object_name,1,60), rpad('*',60)
3 from big_table;
1000000 rows created.
demo@ORA12C> begin
2 dbms_stats.gather_table_stats(
3 ownname=>user,
4 tabname=>'T',
5 method_opt=>'for columns id
size 254');
6 end;
7 /
PL/SQL procedure successfully completed.
Let’s run the query and see what plan it picks
demo@ORA12C> set serveroutput off
demo@ORA12C> select max(code) from t where id = 1;
MAX(CODE)
------------------------------------------------------------
jdk/nashorn/internal/runtime/SpillProperty
demo@ORA12C> select * from table(dbms_xplan.display_cursor(format=>'allstats
last'));
PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------------------
SQL_ID apbgnwvfasyfs,
child number 0
-------------------------------------
select max(code) from t where id = 1
Plan hash value: 1339972470
--------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time
| Buffers |
--------------------------------------------------------------------------------------------------------
| 0 | SELECT
STATEMENT | |
1 | | 1 |00:00:00.01 | 4 |
| 1 | SORT AGGREGATE | |
1 | 1 | 1 |00:00:00.01 | 4 |
| 2 | TABLE ACCESS BY INDEX ROWID BATCHED| T |
1 | 1 | 1 |00:00:00.01 | 4 |
|* 3 | INDEX RANGE
SCAN | T_IDX | 1 |
1 | 1 |00:00:00.01 | 3 |
--------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
3 -
access("ID"=1)
Just to return one row – matching the predicate – the
optimizer picked up the index on the ID column & query completed in less
than a second, by doing four logical IO per execution.
Now let us change the data distribution in this table and
see what plan does it picks in the next run.
demo@ORA12C> select loads, child_number,invalidations
2 from v$sql
3 where sql_id ='apbgnwvfasyfs';
LOADS CHILD_NUMBER
INVALIDATIONS
---------- ------------ -------------
1 0 0
demo@ORA12C> insert into t(id,code,some_other_text)
2 select 1, substr(object_name,1,60),
rpad('*',60)
3 from big_table;
1000000 rows created.
demo@ORA12C> begin
2 dbms_stats.gather_table_stats(
3 ownname=>user,
4 tabname=>'T',
5 method_opt=>'for columns id
size 254');
6 end;
7 /
PL/SQL procedure successfully completed.
demo@ORA12C> select loads, child_number,invalidations
2 from v$sql
3 where sql_id ='apbgnwvfasyfs';
LOADS CHILD_NUMBER
INVALIDATIONS
---------- ------------ -------------
1 0 0
Added a bunch of records to the table with ID=1 and gathering the stats
doesn’t invalidates the existing cursor in the library cache.
demo@ORA12C> select max(code) from t where id = 1;
MAX(CODE)
------------------------------------------------------------
sun/util/xml/PlatformXmlPropertiesProvider$Resolver
demo@ORA12C> select * from
table(dbms_xplan.display_cursor(format=>'allstats last'));
PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------------------------------------------------
SQL_ID apbgnwvfasyfs,
child number 0
-------------------------------------
select max(code) from t where id = 1
Plan hash value: 1339972470
-----------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time
| Buffers | Reads |
-----------------------------------------------------------------------------------------------------------------
| 0 | SELECT
STATEMENT | |
1 | | 1 |00:01:24.03 | 34347 | 15652 |
| 1 | SORT AGGREGATE | |
1 | 1 | 1 |00:01:24.03 | 34347 |
15652 |
| 2 | TABLE ACCESS BY INDEX ROWID BATCHED| T |
1 | 1 | 1000K|00:01:23.73 | 34347 |
15652 |
|* 3 | INDEX RANGE
SCAN | T_IDX | 1 |
1 | 1000K|00:00:08.43 | 5109 |
1067 |
-----------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
3 -
access("ID"=1)
20 rows selected.
Executing the same sql again ended up with reusing the plan and now it get
to return half of the result sets from the table using the same index based
access (using index based access to return half the volume of data from the
table is something seriously wrong). And now it took nearly more than a min and
took about 34K logical IO to complete.
However having the “no_invalidate”
option during the stats gathering will lead the existing child cursor to be
invalidated, that doesn’t mean it will be discarded entirely, it simply means
the existing information about the child cursor cannot be used, instead it must
be reloaded on the next invocation.
demo@ORA12C>
select loads, child_number,invalidations
2 from
v$sql
3
where sql_id ='apbgnwvfasyfs';
LOADS CHILD_NUMBER INVALIDATIONS
----------
------------ -------------
1 0 0
demo@ORA12C>
begin
2
dbms_stats.gather_table_stats(
3
ownname=>user,
4
tabname=>'T',
5
method_opt=>'for columns id size 254',
6
no_invalidate=>false);
7 end;
8 /
PL/SQL
procedure successfully completed.
demo@ORA12C>
select loads, child_number,invalidations
2 from
v$sql
3
where sql_id ='apbgnwvfasyfs';
LOADS CHILD_NUMBER INVALIDATIONS
----------
------------ -------------
1 0 1
demo@ORA12C>
select max(code) from t where id = 1;
MAX(CODE)
------------------------------------------------------------
sun/util/xml/PlatformXmlPropertiesProvider$Resolver
demo@ORA12C>
select * from table(dbms_xplan.display_cursor(format=>'allstats last'));
PLAN_TABLE_OUTPUT
-----------------------------------------------------------------------------------------------------
SQL_ID apbgnwvfasyfs, child number 1
-------------------------------------
select max(code)
from t where id = 1
Plan hash
value: 2966233522
----------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows
| A-Time | Buffers | Reads |
----------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT |
| 1 | |
1 |00:00:06.80 | 29483 |
29465 |
| 1 |
SORT AGGREGATE | |
1 | 1 | 1 |00:00:06.80 | 29483 |
29465 |
|* 2 | TABLE ACCESS FULL| T
| 1 | 1000K|
1000K|00:00:06.78 | 29483
| 29465 |
----------------------------------------------------------------------------------------------
Predicate
Information (identified by operation id):
---------------------------------------------------
2 - filter("ID"=1)
19 rows
selected.
So instead of reusing the plan from the library cache, we invalidated
the existing plan during stats gathering and upon execution we picked the
optimal plan (i.e. Full Table scan rather than index based access path) based
on the current data statistics present in the table.
No comments:
Post a Comment