Friday, February 12, 2016

Session Statistics on Global Temporary Tables in Oracle 12c

Prior to Oracle 12c the database don’t maintain separate statistics for global temporary table (GTT), the database maintains one version of statistics shared by all session, even though the data across sessions  could differ.

Starting with 12c we can set the table-level preference GLOBAL_TEMP_TABLE_STATS to either shared or session-specific (global preference default GLOBAL_TEMP_TABLE_STATS to SESSION). Users can gather statistics on GTT and can have own version of session statistics. During optimization the optimizer first check if session statistics exists if yes, then make use of them. If not optimizer uses Shared statistics if they exist.

rajesh@ORA12C> select dbms_stats.get_prefs('GLOBAL_TEMP_TABLE_STATS')
  2  from dual ;

DBMS_STATS.GET_PREFS('GLOBAL_TEMP_TABLE_STATS')
------------------------------------------------------------------------
SESSION

1 row selected.

rajesh@ORA12C> create global temporary table t(x int);

Table created.

rajesh@ORA12C> insert into t(x) values(0);

1 row created.

rajesh@ORA12C> select num_rows,blocks,last_analyzed,scope
  2  from user_tab_statistics
  3  where table_name ='T';

  NUM_ROWS     BLOCKS LAST_ANALYZ SCOPE
---------- ---------- ----------- -------
                                  SHARED

1 row selected.

rajesh@ORA12C> exec dbms_stats.gather_table_stats(user,'T');

PL/SQL procedure successfully completed.

rajesh@ORA12C> select num_rows,blocks,last_analyzed,scope
  2  from user_tab_statistics
  3  where table_name ='T';

  NUM_ROWS     BLOCKS LAST_ANALYZ SCOPE
---------- ---------- ----------- -------
                                  SHARED
         1          1 11-FEB-2016 SESSION

2 rows selected.

We have session level statistics in place, the optimizer now uses this session statistics for optimal execution.

rajesh@ORA12C> select count(data_object_id)
  2  from big_table
  3  where id in (select x from t) ;

COUNT(DATA_OBJECT_ID)
---------------------
                    0

1 row selected.

rajesh@ORA12C> @xplan_cursor

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------
SQL_ID  3zwrtp37uvz2b, child number 0
-------------------------------------
select count(data_object_id) from big_table where id in (select x from
t)

Plan hash value: 3252046534

---------------------------------------------------------------------------------
| Id  | Operation                     | Name       | Rows  | Bytes | Cost (%CPU)|
---------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |            |       |       |     5 (100)|
|   1 |  SORT AGGREGATE               |            |     1 |     9 |            |
|   2 |   NESTED LOOPS                |            |     1 |     9 |     5  (20)|
|   3 |    NESTED LOOPS               |            |     1 |     9 |     5  (20)|
|   4 |     SORT UNIQUE               |            |     1 |     2 |     2   (0)|
|   5 |      TABLE ACCESS FULL        | T          |     1 |     2 |     2   (0)|
|*  6 |     INDEX RANGE SCAN          | BIG_IDX_01 |     1 |       |     2   (0)|
|   7 |    TABLE ACCESS BY INDEX ROWID| BIG_TABLE  |     1 |     7 |     2   (0)|
---------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   6 - access("ID"="X")

Note
-----
   - this is an adaptive plan
   - Global temporary table session private statistics used


30 rows selected.

The notes section from the plan, shows that it used session statistics for optimization.  When loaded with few more data and did stats gathering, the existing cursor for this session from shared pool got re-loaded with new set of session statistics.

rajesh@ORA12C>
rajesh@ORA12C> insert into t(x)
  2  select rownum
  3  from all_objects
  4  where rownum <=1000;

1000 rows created.

rajesh@ORA12C> exec dbms_stats.gather_table_stats(user,'T');

PL/SQL procedure successfully completed.

rajesh@ORA12C> select num_rows,blocks,last_analyzed,scope
  2  from user_tab_statistics
  3  where table_name ='T';

  NUM_ROWS     BLOCKS LAST_ANAL SCOPE
---------- ---------- --------- -------
                                SHARED
      1001          2 11-feb-16 SESSION

2 rows selected.

rajesh@ORA12C> set autotrace traceonly explain
rajesh@ORA12C> select *
  2  from big_table
  3  where id in (select x from t) ;

Execution Plan
----------------------------------------------------------
Plan hash value: 2358921057

--------------------------------------------------------------------------------
| Id  | Operation                    | Name       | Rows  | Bytes | Cost (%CPU)|
--------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |            |  1001 |   108K|  1506   (1)|
|   1 |  NESTED LOOPS                |            |  1001 |   108K|  1506   (1)|
|   2 |   NESTED LOOPS               |            |  1001 |   108K|  1506   (1)|
|   3 |    SORT UNIQUE               |            |  1001 |  4004 |     2   (0)|
|   4 |     TABLE ACCESS FULL        | T          |  1001 |  4004 |     2   (0)|
|*  5 |    INDEX RANGE SCAN          | BIG_IDX_01 |     1 |       |     2   (0)|
|   6 |   TABLE ACCESS BY INDEX ROWID| BIG_TABLE  |     1 |   107 |     3   (0)|
--------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   5 - access("ID"="X")

Note
-----
   - this is an adaptive plan
   - Global temporary table session private statistics used

rajesh@ORA12C> set autotrace off
rajesh@ORA12C>

Now from a different session tried this.

rajesh@ORA12C> select num_rows,blocks,last_analyzed,scope
  2  from user_tab_statistics
  3  where table_name ='T';

  NUM_ROWS     BLOCKS LAST_ANALYZ SCOPE
---------- ---------- ----------- -------
                                  SHARED

1 row selected.

rajesh@ORA12C> set serveroutput off
rajesh@ORA12C> select count(data_object_id)
  2  from big_table
  3  where id in (select x from t) ;

COUNT(DATA_OBJECT_ID)
---------------------
                    0

1 row selected.

rajesh@ORA12C> @xplan_cursor

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------------
SQL_ID  3zwrtp37uvz2b, child number 0
-------------------------------------
select count(data_object_id) from big_table where id in (select x from
t)

Plan hash value: 3252046534

---------------------------------------------------------------------------------
| Id  | Operation                     | Name       | Rows  | Bytes | Cost (%CPU)|
---------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |            |       |       |     6 (100)|
|   1 |  SORT AGGREGATE               |            |     1 |    20 |            |
|   2 |   NESTED LOOPS                |            |     1 |    20 |     6  (17)|
|   3 |    NESTED LOOPS               |            |     1 |    20 |     6  (17)|
|   4 |     SORT UNIQUE               |            |     1 |    13 |     2   (0)|
|   5 |      TABLE ACCESS FULL        | T          |     1 |    13 |     2   (0)|
|*  6 |     INDEX RANGE SCAN          | BIG_IDX_01 |     1 |       |     2   (0)|
|   7 |    TABLE ACCESS BY INDEX ROWID| BIG_TABLE  |     1 |     7 |     3   (0)|
---------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   6 - access("ID"="X")

Note
-----
   - dynamic statistics used: dynamic sampling (level=2)
   - this is an adaptive plan


30 rows selected.

Since no session-specific and shared statistics Dynamic sampling kicked in, with shared statistics in place, optimizer make use of shared statistics.

rajesh@ORA12C>
rajesh@ORA12C>
rajesh@ORA12C> begin
  2     dbms_stats.set_table_prefs(user,'T',
  3             'GLOBAL_TEMP_TABLE_STATS','SHARED');
  4     dbms_stats.set_table_stats(user,'T',
  5             numrows=>5000,
  6             numblks=>500);
  7  end;
  8  /

PL/SQL procedure successfully completed.

rajesh@ORA12C> select num_rows,blocks,last_analyzed,scope
  2  from user_tab_statistics
  3  where table_name ='T';

  NUM_ROWS     BLOCKS LAST_ANALYZ SCOPE
---------- ---------- ----------- -------
      5000        500 11-FEB-2016 SHARED

1 row selected.

rajesh@ORA12C> begin
  2     dbms_stats.set_table_prefs(user,'T',
  3             'GLOBAL_TEMP_TABLE_STATS','SESSION');
  4  end;
  5  /

PL/SQL procedure successfully completed.

rajesh@ORA12C> select num_rows,blocks,last_analyzed,scope
  2  from user_tab_statistics
  3  where table_name ='T';

  NUM_ROWS     BLOCKS LAST_ANALYZ SCOPE
---------- ---------- ----------- -------
      5000        500 11-FEB-2016 SHARED

1 row selected.

rajesh@ORA12C> select count(data_object_id)
  2  from big_table
  3  where id in (select x from t) ;

COUNT(DATA_OBJECT_ID)
---------------------
                    0

1 row selected.

rajesh@ORA12C>
rajesh@ORA12C> @xplan_cursor

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------
SQL_ID  3zwrtp37uvz2b, child number 0
-------------------------------------
select count(data_object_id) from big_table where id in (select x from
t)

Plan hash value: 3252046534

---------------------------------------------------------------------------------
| Id  | Operation                     | Name       | Rows  | Bytes | Cost (%CPU)|
---------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |            |       |       |   606 (100)|
|   1 |  SORT AGGREGATE               |            |     1 |    20 |            |
|   2 |   NESTED LOOPS                |            |   156 |  3120 |   606   (1)|
|   3 |    NESTED LOOPS               |            |   156 |  3120 |   606   (1)|
|   4 |     SORT UNIQUE               |            |  5000 | 65000 |   137   (0)|
|   5 |      TABLE ACCESS FULL        | T          |  5000 | 65000 |   137   (0)|
|*  6 |     INDEX RANGE SCAN          | BIG_IDX_01 |     1 |       |     2   (0)|
|   7 |    TABLE ACCESS BY INDEX ROWID| BIG_TABLE  |     1 |     7 |     3   (0)|
---------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------

   6 - access("ID"="X")

Note
-----
   - this is an adaptive plan


29 rows selected.

See, no indication of “session private statistics” in notes section.  With Session-specific and shared statistics in place, session statistics take more preference for optimization.

rajesh@ORA12C> insert into t(x)
  2  select rownum
  3  from all_objects;

89984 rows created.

rajesh@ORA12C>
rajesh@ORA12C> exec dbms_stats.gather_table_stats(user,'T');

PL/SQL procedure successfully completed.

rajesh@ORA12C> select num_rows,blocks,last_analyzed,scope
  2  from user_tab_statistics
  3  where table_name ='T';

  NUM_ROWS     BLOCKS LAST_ANALYZ SCOPE
---------- ---------- ----------- -------
      5000        500 11-FEB-2016 SHARED
     89984        137 11-FEB-2016 SESSION

2 rows selected.

rajesh@ORA12C> select count(data_object_id)
  2  from big_table
  3  where id in (select x from t) ;

COUNT(DATA_OBJECT_ID)
---------------------
                 6534

1 row selected.

rajesh@ORA12C>
rajesh@ORA12C> @xplan_cursor

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------
SQL_ID  3zwrtp37uvz2b, child number 0
-------------------------------------
select count(data_object_id) from big_table where id in (select x from
t)

Plan hash value: 2375446597

-----------------------------------------------------------------------
| Id  | Operation            | Name      | Rows  | Bytes | Cost (%CPU)|
-----------------------------------------------------------------------
|   0 | SELECT STATEMENT     |           |       |       |  4675 (100)|
|   1 |  SORT AGGREGATE      |           |     1 |    12 |            |
|*  2 |   HASH JOIN          |           | 89984 |  1054K|  4675   (1)|
|   3 |    SORT UNIQUE       |           | 89984 |   439K|    39   (0)|
|   4 |     TABLE ACCESS FULL| T         | 89984 |   439K|    39   (0)|
|   5 |    TABLE ACCESS FULL | BIG_TABLE |  1000K|  6835K|  4355   (1)|
-----------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("ID"="X")

Note
-----
   - this is an adaptive plan
   - Global temporary table session private statistics used


28 rows selected.

rajesh@ORA12C>

Function based index and missing statistics

You have a column with 100 distinct values and out of those if you select just one values (assuming even distribution of data) you will select approximately 1% of data in that table.

However if the table potentially allow mixed set of values like ‘Hello’, ‘hello’ or ‘HEllo’ etc, therefore you decide to write a query that first converts all the fields to upper case such that you can now compare all possible values of specific name to HELLO.

Now, what is the selectivity of the column after you converted everything to upper case? There is no easy answer to this, it all depends on how many distinct values haves in our data.

However, what if the function did something more complex and converted the name to some numerical values based on the sum of all characters via some wired formula. In theory you could have no distinct values in that everything could get converted into a NULL or you could possibly have many distinct values.

Again, oracle can’t really tell how many rows get selected based on the determination of such function. When a predicate is used with a function call, oracle has as great difficulty in determining the correct selectivity. Therefore when you create a function based index, oracle behind the scenes create a hidden virtual column on the parent table in order to capture the data characteristics of the function so that CBO can make an accurate determination of selectivity associated with that function. 

If oracle knows the low_value, high_value, num of nulls and NDV on a column it can then accurately determine the selectivity and cardinality when the function is used on the column and hence calculate an accurate cost and determine whether the use of FBI is appropriate.

When function based index is created, oracle will now (even with 12c) automatically collects the statistics associated with the index (like blevel, number of leaf blocks, CF etc) but not the statistics associated with the hidden virtual columns as these statistics are associated with the table and not with the index.

rajesh@ORA12C> create table t(x int, flag varchar2(1));

Table created.

rajesh@ORA12C> insert /*+ append */ into t(x,flag)
  2  select rownum,case when rownum <=10
  3                     then 'N' else 'Y' end
  4  from big_table;

1000000 rows created.

rajesh@ORA12C> commit;

Commit complete.

rajesh@ORA12C> create index t_idx on t(decode(flag,'Y',1));

Index created.

rajesh@ORA12C>

Behind the scenes oracle has created a hidden virtual columns.

rajesh@ORA12C> select column_name,num_distinct,low_value,
  2          high_value,density,num_nulls,last_analyzed
  3  from user_tab_cols
  4  where table_name ='T'
  5  order by column_name;

COLUMN_NAME     NUM_DISTINCT LOW_V HIGH_    DENSITY  NUM_NULLS LAST_ANALYZ
--------------- ------------ ----- ----- ---------- ---------- -----------
FLAG                       2 4E    59            .5          0 11-FEB-2016
SYS_NC00003$   
X                    1000000 C102  C402     .000001          0 11-FEB-2016
                
3 rows selected.

Therefore even after you’ve created the index, Oracle still has no idea on the selectivity associated the function because the necessary virtual column statistics are still missing. Oracle simply makes “guess” as with more guesses it will quite likely to be wrong. Which means selectivity like to be wrong in-turn costing will be wrong in-turn execution plan will be wrong.

rajesh@ORA12C> set autotrace traceonly explain statistics
rajesh@ORA12C> select * from t where decode(flag,'Y',1)=1;

999990 rows selected.


Execution Plan
----------------------------------------------------------
Plan hash value: 767293772

---------------------------------------------------------------------------------------------
| Id  | Operation                           | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |       | 10000 | 70000 |   460   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID BATCHED| T     | 10000 | 70000 |   460   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN                  | T_IDX |  4000 |       |  1952   (0)| 00:00:01 |
---------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access(DECODE("FLAG",'Y',1)=1)


Statistics
----------------------------------------------------------
          3  recursive calls
          0  db block gets
      16947  consistent gets
       3609  physical reads
          0  redo size
   11253571  bytes sent via SQL*Net to client
      73878  bytes received via SQL*Net from client
       6668  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
     999990  rows processed

rajesh@ORA12C> select * from t where flag='Y';

999990 rows selected.


Execution Plan
----------------------------------------------------------
Plan hash value: 1601196873

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |   500K|  3417K|   467   (2)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| T    |   500K|  3417K|   467   (2)| 00:00:01 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("FLAG"='Y')


Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
       8330  consistent gets
          0  physical reads
          0  redo size
   11253571  bytes sent via SQL*Net to client
      73878  bytes received via SQL*Net from client
       6668  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
     999990  rows processed

rajesh@ORA12C> set autotrace off

What should be done after creating a function based indexed? You should collect statistics on the HIDDEN columns, so that CBO can accurately determine the cardinality associated with using the function and so make correct decision regarding the most appropriate execution plan.

rajesh@ORA12C> begin
  2     dbms_stats.gather_table_stats(user,'T',
  3             method_opt=>'for all hidden columns size 1',
  4             no_invalidate=>false);
  5  end;
  6  /

PL/SQL procedure successfully completed.

rajesh@ORA12C> select column_name,num_distinct,low_value,
  2          high_value,density,num_nulls,last_analyzed
  3  from user_tab_cols
  4  where table_name ='T'
  5  order by column_name;

COLUMN_NAME     NUM_DISTINCT LOW_V HIGH_    DENSITY  NUM_NULLS LAST_ANALYZED
--------------- ------------ ----- ----- ---------- ---------- --------------------
FLAG                       2 4E    59            .5          0 11-FEB-2016 13:59:44
SYS_NC00003$               1 C102  C102           1         10 11-FEB-2016 14:00:18
X                    1000000 C102  C402     .000001          0 11-FEB-2016 13:59:44
               
3 rows selected.

rajesh@ORA12C> set autotrace traceonly explain statistics
rajesh@ORA12C> select * from t where decode(flag,'Y',1)=1;

999990 rows selected.


Execution Plan
----------------------------------------------------------
Plan hash value: 1601196873

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |   999K|  9765K|   474   (3)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| T    |   999K|  9765K|   474   (3)| 00:00:01 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter(DECODE("FLAG",'Y',1)=1)


Statistics
----------------------------------------------------------
         28  recursive calls
          0  db block gets
       8377  consistent gets
          0  physical reads
          0  redo size
   11253571  bytes sent via SQL*Net to client
      73878  bytes received via SQL*Net from client
       6668  SQL*Net roundtrips to/from client
          6  sorts (memory)
          0  sorts (disk)
     999990  rows processed

rajesh@ORA12C>
rajesh@ORA12C> set autotrace off
rajesh@ORA12C>