Monday, January 17, 2022

Real Time statistics - Part VII

In this blog post we are going to see about the effect of real time statistics feature with table truncate and reload operations.
 
demo@PDB19> show parameter optimizer_real_time_statistics
 
NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
optimizer_real_time_statistics       boolean     FALSE
demo@PDB19> alter session set optimizer_real_time_statistics=true;
 
Session altered.
 
demo@PDB19> create table t(
  2     n1 number primary key,
  3     n2 number ,
  4     n3 varchar2(80) );
 
Table created.
 
demo@PDB19> insert /*+ append */ into t(n1,n2,n3)
  2  with rws as ( select rownum n from dual connect by level <= 1000 )
  3  select n, ceil(sqrt(n)),
  4     to_char(to_date(n,'j'),'jsp')
  5  from rws;
 
1000 rows created.
 
demo@PDB19> commit;
 
Commit complete.
 
demo@PDB19> truncate table t;
 
Table truncated.
 
That above insert statement was a direct path insert – which led to basic statistics gathered on the table – however this truncate statement removed the data from the table but not its statistics
 
demo@PDB19> select num_rows,blocks,monitoring
  2  from user_tables
  3  where table_name ='T';
 
  NUM_ROWS     BLOCKS MON
---------- ---------- ---
      1000          8 YES
 
demo@PDB19> select num_rows,blocks,stale_stats,notes
  2  from user_tab_statistics
  3  where table_name ='T';
 
  NUM_ROWS     BLOCKS STALE_S NOTES
---------- ---------- ------- -------------------------
      1000          8 YES
 
Now when we reload the data using conventional path dml, the real-time statistics will kick in and end up like this
 
demo@PDB19> insert into t(n1,n2,n3)
  2  with rws as ( select rownum+1000 n from dual connect by level <= 10000 )
  3  select n, ceil(sqrt(n)),
  4     to_char(to_date(n,'j'),'jsp')
  5  from rws;
 
10000 rows created.
 
demo@PDB19> commit;
 
Commit complete.
 
demo@PDB19> exec dbms_stats.flush_database_monitoring_info;
 
PL/SQL procedure successfully completed.
 
demo@PDB19> select num_rows,blocks,stale_stats,notes
  2  from user_tab_statistics
  3  where table_name ='T';
 
  NUM_ROWS     BLOCKS STALE_S NOTES
---------- ---------- ------- -------------------------
      1000          8 YES
     10000         73         STATS_ON_CONVENTIONAL_DML
 
demo@PDB19> select t1.column_name,t1.num_nulls,t1.num_distinct,t1.low_value,get_stats_val(t2.data_type,t1.low_value) low_val,
  2      t1.high_value,get_stats_val(t2.data_type,t1.high_value) high_val, t1.histogram , t1.notes
  3  from user_tab_col_statistics t1,
  4       user_tab_columns t2
  5  where t1.table_name ='T'
  6  and t1.table_name = t2.table_name
  7  and t1.column_name = t2.column_name;
 
COLUMN_NAM  NUM_NULLS NUM_DISTINCT LOW_VALUE  LOW_VAL    HIGH_VALUE HIGH_VAL             HISTOGRAM       NOTES
---------- ---------- ------------ ---------- ---------- ---------- -------------------- --------------- -------------------------
N1                  0         1000 C102       1          C20B       1000                 NONE            STATS_ON_LOAD
N2                  0           32 C102       1          C121       32                   NONE            STATS_ON_LOAD
N3                  0         1000 6569676874 eight      74776F2068 two hundred two      NONE            STATS_ON_LOAD
                                                         756E647265
                                                         642074776F
 
N1                  0              C102       1          C3020A59   10988                                STATS_ON_CONVENTIONAL_DML
N2                  0              C102       1          C20206     105                                  STATS_ON_CONVENTIONAL_DML
N3                  0              6569676874 eight      74776F2074 two thousand two hun                 STATS_ON_CONVENTIONAL_DML
                                                         686F757361 dred twenty-six
                                                         6E64207477
                                                         6F2068756E
                                                         6472656420
                                                         7477656E74
                                                         792D736978
 
 
The base statistics says we got 1000 rows initially load and then real-time statistics says we got 10K rows loaded, now when we parse a sql with predicate matching the subset of data in the range that got truncated before returns no rows (because the data in the range of 1 to 1000 got truncated).
 
demo@PDB19> select /*+ gather_plan_statistics */ count(*) from t where n1 < 501;
 
  COUNT(*)
----------
         0
 
demo@PDB19> select * from table( dbms_xplan.display_cursor(format=>'allstats last'));
 
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------
SQL_ID  agtrz8n1288u0, child number 0
-------------------------------------
select /*+ gather_plan_statistics */ count(*) from t where n1 < 501
 
Plan hash value: 1241160216
 
-------------------------------------------------------------------------------------------
| Id  | Operation         | Name        | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
-------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |             |      1 |        |      1 |00:00:00.01 |       2 |
|   1 |  SORT AGGREGATE   |             |      1 |      1 |      1 |00:00:00.01 |       2 |
|*  2 |   INDEX RANGE SCAN| SYS_C007944 |      1 |    455 |      0 |00:00:00.01 |       2 |
-------------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   2 - access("N1"<501)
 
Note
-----
   - dynamic statistics used: statistics for conventional DML
 
 
23 rows selected.
 
demo@PDB19>
 
but looking at the column level statistics it says the column N1 got data in the range of 1 to 11000, so when we parse the sql, the optimizer took into this column level statistics and estimated the cardinality as
 
cardinality = number of rows * ( limit – low_value) / ( high_value – low_value )
cardinality = 10000 * ( 501 – 1 ) / ( 10988 – 1 ) = 455.08
 
so if the application requires the frequent truncate and reload, then ensure that either the reload is done as part of direct path loads or if it is a conventional path load have them stats regathered manually at the end of the load to prevent the suboptimal plans post the reload.

No comments:

Post a Comment