Thursday, March 22, 2018

Cursor invalidation (no_invalidate param in stats gathering)

One of the nice capability of Oracle database is the ability to reuse the cursor/plans across the multiple executions. However reusing the plan for each execution will reduce the total (hard) parsing done in the database that in-turn improve the scalability of the application.
However reusing the plan in the ETL or warehousing environment can be dangerous, since the data volume returned for each execution will be different. So the plan generated for the first execution will not be the optimal plan for the subsequent execution.
Even the stats gathered before the subsequent execution won’t help unless until no_invalidate option is included in the stats gathering process.
Here is the demo, Created a table with index on the ID column, then data loaded and gathered the stats on this table along with a histogram on the ID column (since that column is massively skewed)
demo@ORA12C> create table t(id number,code varchar2(60),some_other_text varchar2(60) );
 
Table created.
 
demo@ORA12C> create index t_idx on t(id);
 
Index created.
 
demo@ORA12C> insert into t(id,code,some_other_text)
  2  select decode(rownum,1,1,99), substr(object_name,1,60), rpad('*',60)
  3  from big_table;
 
1000000 rows created.
 
demo@ORA12C> begin
  2     dbms_stats.gather_table_stats(
  3             ownname=>user,
  4             tabname=>'T',
  5             method_opt=>'for columns id size 254');
  6  end;
  7  /
 
PL/SQL procedure successfully completed.
 
Let’s run the query and see what plan it picks
demo@ORA12C> set serveroutput off
demo@ORA12C> select max(code) from t where id = 1;
 
MAX(CODE)
------------------------------------------------------------
jdk/nashorn/internal/runtime/SpillProperty
 
demo@ORA12C> select * from table(dbms_xplan.display_cursor(format=>'allstats last'));
 
PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------------------
SQL_ID  apbgnwvfasyfs, child number 0
-------------------------------------
select max(code) from t where id = 1
 
Plan hash value: 1339972470
 
--------------------------------------------------------------------------------------------------------
| Id  | Operation                            | Name  | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
--------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                     |       |      1 |        |      1 |00:00:00.01 |       4 |
|   1 |  SORT AGGREGATE                      |       |      1 |      1 |      1 |00:00:00.01 |       4 |
|   2 |   TABLE ACCESS BY INDEX ROWID BATCHED| T     |      1 |      1 |      1 |00:00:00.01 |       4 |
|*  3 |    INDEX RANGE SCAN                  | T_IDX |      1 |      1 |      1 |00:00:00.01 |       3 |
--------------------------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   3 - access("ID"=1)
 
Just to return one row – matching the predicate – the optimizer picked up the index on the ID column & query completed in less than a second, by doing four logical IO per execution.
Now let us change the data distribution in this table and see what plan does it picks in the next run.
demo@ORA12C> select loads, child_number,invalidations
  2  from v$sql
  3  where sql_id ='apbgnwvfasyfs';
 
     LOADS CHILD_NUMBER INVALIDATIONS
---------- ------------ -------------
         1            0             0
 
demo@ORA12C> insert into t(id,code,some_other_text)
  2  select 1, substr(object_name,1,60), rpad('*',60)
  3  from big_table;
 
1000000 rows created.
 
demo@ORA12C> begin
  2     dbms_stats.gather_table_stats(
  3             ownname=>user,
  4             tabname=>'T',
  5             method_opt=>'for columns id size 254');
  6  end;
  7  /
 
PL/SQL procedure successfully completed.
 
demo@ORA12C> select loads, child_number,invalidations
  2  from v$sql
  3  where sql_id ='apbgnwvfasyfs';
 
     LOADS CHILD_NUMBER INVALIDATIONS
---------- ------------ -------------
         1            0             0
 
 
Added a bunch of records to the table with ID=1 and gathering the stats doesn’t invalidates the existing cursor in the library cache.
 
demo@ORA12C> select max(code) from t where id = 1;
 
MAX(CODE)
------------------------------------------------------------
sun/util/xml/PlatformXmlPropertiesProvider$Resolver
 
demo@ORA12C> select * from table(dbms_xplan.display_cursor(format=>'allstats last'));
 
PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------------------------------------------------
SQL_ID  apbgnwvfasyfs, child number 0
-------------------------------------
select max(code) from t where id = 1
 
Plan hash value: 1339972470
 
-----------------------------------------------------------------------------------------------------------------
| Id  | Operation                            | Name  | Starts | E-Rows | A-Rows |   A-Time   | Buffers | Reads  |
-----------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                     |       |      1 |        |      1 |00:01:24.03 |   34347 |  15652 |
|   1 |  SORT AGGREGATE                      |       |      1 |      1 |      1 |00:01:24.03 |   34347 |  15652 |
|   2 |   TABLE ACCESS BY INDEX ROWID BATCHED| T     |      1 |      1 |   1000K|00:01:23.73 |   34347 |  15652 |
|*  3 |    INDEX RANGE SCAN                  | T_IDX |      1 |      1 |   1000K|00:00:08.43 |    5109 |   1067 |
-----------------------------------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   3 - access("ID"=1)
 
 
20 rows selected.
 
Executing the same sql again ended up with reusing the plan and now it get to return half of the result sets from the table using the same index based access (using index based access to return half the volume of data from the table is something seriously wrong). And now it took nearly more than a min and took about 34K logical IO to complete.
 
However having the “no_invalidate” option during the stats gathering will lead the existing child cursor to be invalidated, that doesn’t mean it will be discarded entirely, it simply means the existing information about the child cursor cannot be used, instead it must be reloaded on the next invocation.
 
demo@ORA12C> select loads, child_number,invalidations
  2  from v$sql
  3  where sql_id ='apbgnwvfasyfs';
 
     LOADS CHILD_NUMBER INVALIDATIONS
---------- ------------ -------------
         1            0             0
 
demo@ORA12C> begin
  2     dbms_stats.gather_table_stats(
  3             ownname=>user,
  4             tabname=>'T',
  5             method_opt=>'for columns id size 254',
  6             no_invalidate=>false);
  7  end;
  8  /
 
PL/SQL procedure successfully completed.
 
demo@ORA12C> select loads, child_number,invalidations
  2  from v$sql
  3  where sql_id ='apbgnwvfasyfs';
 
     LOADS CHILD_NUMBER INVALIDATIONS
---------- ------------ -------------
         1            0             1
 
demo@ORA12C> select max(code) from t where id = 1;
 
MAX(CODE)
------------------------------------------------------------
sun/util/xml/PlatformXmlPropertiesProvider$Resolver
 
demo@ORA12C> select * from table(dbms_xplan.display_cursor(format=>'allstats last'));
 
PLAN_TABLE_OUTPUT
-----------------------------------------------------------------------------------------------------
SQL_ID  apbgnwvfasyfs, child number 1
-------------------------------------
select max(code) from t where id = 1
 
Plan hash value: 2966233522
 
----------------------------------------------------------------------------------------------
| Id  | Operation          | Name | Starts | E-Rows | A-Rows |   A-Time   | Buffers | Reads  |
----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |      1 |        |      1 |00:00:06.80 |   29483 |  29465 |
|   1 |  SORT AGGREGATE    |      |      1 |      1 |      1 |00:00:06.80 |   29483 |  29465 |
|*  2 |   TABLE ACCESS FULL| T    |      1 |   1000K|   1000K|00:00:06.78 |   29483 |  29465 |
----------------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   2 - filter("ID"=1)
 
 
19 rows selected.
 
So instead of reusing the plan from the library cache, we invalidated the existing plan during stats gathering and upon execution we picked the optimal plan (i.e. Full Table scan rather than index based access path) based on the current data statistics present in the table.

No comments:

Post a Comment