In this blog post, we will look at how the real-time statistics feature interacts with table truncate-and-reload operations.
demo@PDB19> show parameter optimizer_real_time_statistics

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
optimizer_real_time_statistics       boolean     FALSE
demo@PDB19> alter session set optimizer_real_time_statistics=true;
Session altered.
demo@PDB19> create table t(
  2    n1 number primary key,
  3    n2 number ,
  4    n3 varchar2(80) );
Table created.
demo@PDB19> insert /*+ append */ into t(n1,n2,n3)
2 with rws as ( select rownum n from dual connect by level <= 1000 )
3 select n, ceil(sqrt(n)),
4 to_char(to_date(n,'j'),'jsp')
5 from rws;
1000 rows created.
demo@PDB19> commit;
Commit complete.
demo@PDB19> truncate table t;
Table truncated.
The insert above was a direct-path insert, which triggered online statistics gathering on the table. The subsequent truncate removed the data from the table, but not its statistics:
demo@PDB19> select num_rows, blocks, monitoring
2 from user_tables
3 where table_name ='T';
  NUM_ROWS     BLOCKS MON
---------- ---------- ---
      1000          8 YES
demo@PDB19> select num_rows, blocks, stale_stats, notes
2 from user_tab_statistics
3 where table_name ='T';
  NUM_ROWS     BLOCKS STALE_S NOTES
---------- ---------- ------- -------------------------
      1000          8 YES
Now when we reload the data using conventional-path DML, real-time statistics kick in, and the statistics end up like this:
demo@PDB19> insert into t(n1,n2,n3)
2 with rws as ( select rownum+1000 n from dual connect by level <= 10000 )
3 select n, ceil(sqrt(n)),
4 to_char(to_date(n,'j'),'jsp')
5 from rws;
10000 rows created.
demo@PDB19> commit;
Commit complete.
demo@PDB19> exec dbms_stats.flush_database_monitoring_info;

PL/SQL procedure successfully completed.
demo@PDB19> select num_rows, blocks, stale_stats, notes
2 from user_tab_statistics
3 where table_name ='T';
  NUM_ROWS     BLOCKS STALE_S NOTES
---------- ---------- ------- -------------------------
      1000          8 YES
     10000         73         STATS_ON_CONVENTIONAL_DML
demo@PDB19> select t1.column_name, t1.num_nulls, t1.num_distinct, t1.low_value, get_stats_val(t2.data_type,t1.low_value) low_val,
2 t1.high_value,get_stats_val(t2.data_type,t1.high_value) high_val, t1.histogram , t1.notes
3 from user_tab_col_statistics t1,
4 user_tab_columns t2
5 where t1.table_name ='T'
6 and t1.table_name = t2.table_name
7 and t1.column_name = t2.column_name;
COLUMN_NAM  NUM_NULLS NUM_DISTINCT LOW_VALUE  LOW_VAL    HIGH_VALUE HIGH_VAL             HISTOGRAM       NOTES
---------- ---------- ------------ ---------- ---------- ---------- -------------------- --------------- -------------------------
N1 0 1000 C102 1 C20B 1000 NONE STATS_ON_LOAD
N2 0 32 C102 1 C121 32 NONE STATS_ON_LOAD
N3 0 1000 6569676874 eight 74776F2068 two hundred two NONE STATS_ON_LOAD
756E647265
642074776F
N1 0 C102 1 C3020A59 10988
STATS_ON_CONVENTIONAL_DML
N2 0 C102 1 C20206 105 STATS_ON_CONVENTIONAL_DML
N3 0 6569676874 eight 74776F2074 two thousand two hun STATS_ON_CONVENTIONAL_DML
686F757361 dred twenty-six
6E64207477
6F2068756E
6472656420
7477656E74
792D736978
The base statistics say we initially loaded 1,000 rows, and the real-time statistics say another 10,000 rows were loaded. Now we parse a SQL statement whose predicate matches a subset of the truncated range; it returns no rows, because the data in the range 1 to 1000 was truncated:
demo@PDB19> select /*+ gather_plan_statistics */ count(*) from t where n1 < 501;
COUNT(*)
----------
0
demo@PDB19> select * from table( dbms_xplan.display_cursor(format=>'allstats last'));
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------
SQL_ID agtrz8n1288u0, child number 0
-------------------------------------
select /*+ gather_plan_statistics */ count(*) from t where n1 < 501
Plan hash value: 1241160216
-------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers |
-------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 1 |00:00:00.01 | 2 |
| 1 | SORT AGGREGATE | | 1 | 1 | 1 |00:00:00.01 | 2 |
|* 2 | INDEX RANGE SCAN| SYS_C007944 | 1 | 455 | 0 |00:00:00.01 | 2 |
-------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("N1"<501)
Note
-----
- dynamic statistics used: statistics for conventional DML
23 rows selected.
demo@PDB19>
But looking at the column-level statistics, column N1 now appears to hold data in the range 1 to roughly 11,000 (high_value = 10988). So when we parse the SQL, the optimizer takes these column-level statistics into account and estimates the cardinality as:

cardinality = num_rows * ( limit - low_value ) / ( high_value - low_value )
cardinality = 10000 * ( 501 - 1 ) / ( 10988 - 1 ) = 455.08
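The estimate can be reproduced with a quick calculation. This is a sketch of the standard linear-interpolation formula for range predicates, using the values shown above, not the optimizer's actual code:

```python
# Optimizer-style cardinality estimate for the predicate n1 < 501,
# using num_rows and low/high values from the real-time statistics.
num_rows = 10000    # rows, per STATS_ON_CONVENTIONAL_DML
low_value = 1       # low_value for column N1
high_value = 10988  # high_value for column N1
limit = 501         # predicate boundary

cardinality = num_rows * (limit - low_value) / (high_value - low_value)
print(round(cardinality, 2))  # 455.08
print(round(cardinality))     # 455 -- matches E-Rows in the plan above
```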
So if the application requires frequent truncate-and-reload cycles, ensure that either the reload is done as a direct-path load, or, if it is a conventional-path load, that statistics are regathered manually at the end of the load, to prevent suboptimal plans after the reload.
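One way to do the manual regather (a sketch; the options shown are illustrative, not the only reasonable choices) is to call dbms_stats right after the conventional-path reload:

```sql
-- After the conventional-path reload, regather statistics on T so the
-- optimizer sees the current low/high values instead of the merged ones.
begin
  dbms_stats.gather_table_stats(
    ownname       => user,
    tabname       => 'T',
    no_invalidate => false );  -- invalidate dependent cursors immediately
end;
/
```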