In this blog post, we will look at how the real-time statistics feature interacts with table truncate-and-reload operations.
demo@PDB19> show parameter optimizer_real_time_statistics

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
optimizer_real_time_statistics       boolean     FALSE
demo@PDB19> alter session set optimizer_real_time_statistics=true;
Session altered.
demo@PDB19> create table t(
  2    n1 number primary key,
  3    n2 number ,
  4    n3 varchar2(80) );
Table created.
demo@PDB19> insert /*+ append */ into t(n1,n2,n3)
2 with rws as ( select rownum n from dual connect by level <= 1000 )
3 select n, ceil(sqrt(n)),
4 to_char(to_date(n,'j'),'jsp')
5 from rws;
1000 rows created.
demo@PDB19> commit;
Commit complete.
demo@PDB19> truncate table t;
Table truncated.
The insert above was a direct-path insert, which triggered online statistics gathering on the table. The subsequent truncate removed the data from the table, but not its statistics:
demo@PDB19> select num_rows, blocks, monitoring
2 from user_tables
3 where table_name ='T';
  NUM_ROWS     BLOCKS MON
---------- ---------- ---
      1000          8 YES
demo@PDB19> select num_rows, blocks, stale_stats, notes
2 from user_tab_statistics
3 where table_name ='T';
  NUM_ROWS     BLOCKS STALE_S NOTES
---------- ---------- ------- -------------------------
      1000          8 YES
Now when we reload the data using conventional-path DML, real-time statistics kick in, and the statistics end up like this:
demo@PDB19> insert into t(n1,n2,n3)
2 with rws as ( select rownum+1000 n from dual connect by level <= 10000 )
3 select n, ceil(sqrt(n)),
4 to_char(to_date(n,'j'),'jsp')
5 from rws;
10000 rows created.
demo@PDB19> commit;
Commit complete.
demo@PDB19> exec dbms_stats.flush_database_monitoring_info;

PL/SQL procedure successfully completed.
demo@PDB19> select num_rows, blocks, stale_stats, notes
2 from user_tab_statistics
3 where table_name ='T';
  NUM_ROWS     BLOCKS STALE_S NOTES
---------- ---------- ------- -------------------------
      1000          8 YES
     10000         73         STATS_ON_CONVENTIONAL_DML
demo@PDB19> select t1.column_name, t1.num_nulls, t1.num_distinct, t1.low_value, get_stats_val(t2.data_type,t1.low_value) low_val,
2 t1.high_value,get_stats_val(t2.data_type,t1.high_value) high_val, t1.histogram , t1.notes
3 from user_tab_col_statistics t1,
4 user_tab_columns t2
5 where t1.table_name ='T'
6 and t1.table_name = t2.table_name
7 and t1.column_name = t2.column_name;
COLUMN_NAM  NUM_NULLS NUM_DISTINCT LOW_VALUE  LOW_VAL    HIGH_VALUE HIGH_VAL             HISTOGRAM       NOTES
---------- ---------- ------------ ---------- ---------- ---------- -------------------- --------------- -------------------------
N1 0 1000 C102 1 C20B 1000 NONE STATS_ON_LOAD
N2 0 32 C102 1 C121 32 NONE STATS_ON_LOAD
N3 0 1000 6569676874 eight 74776F2068 two hundred two NONE STATS_ON_LOAD
756E647265
642074776F
N1 0 C102 1 C3020A59 10988
STATS_ON_CONVENTIONAL_DML
N2 0 C102 1 C20206 105 STATS_ON_CONVENTIONAL_DML
N3 0 6569676874 eight 74776F2074 two thousand two hun STATS_ON_CONVENTIONAL_DML
686F757361 dred twenty-six
6E64207477
6F2068756E
6472656420
7477656E74
792D736978
The base statistics say we initially loaded 1,000 rows, and the real-time statistics say another 10,000 rows were loaded. Now we parse a SQL statement whose predicate matches a subset of the truncated range; it returns no rows, because the data in the range 1 to 1000 was truncated:
demo@PDB19> select /*+ gather_plan_statistics */ count(*) from t where n1 < 501;
COUNT(*)
----------
0
demo@PDB19> select * from table( dbms_xplan.display_cursor(format=>'allstats last'));
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------
SQL_ID agtrz8n1288u0, child number 0
-------------------------------------
select /*+ gather_plan_statistics */ count(*) from t where n1 < 501
Plan hash value: 1241160216
-------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers |
-------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 1 |00:00:00.01 | 2 |
| 1 | SORT AGGREGATE | | 1 | 1 | 1 |00:00:00.01 | 2 |
|* 2 | INDEX RANGE SCAN| SYS_C007944 | 1 | 455 | 0 |00:00:00.01 | 2 |
-------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("N1"<501)
Note
-----
- dynamic statistics used: statistics for conventional DML
23 rows selected.
demo@PDB19>
But looking at the column-level statistics, column N1 now appears to hold data in the range 1 to roughly 11,000 (high_value = 10988). So when we parse the SQL, the optimizer takes these column-level statistics into account and estimates the cardinality as:

cardinality = num_rows * ( limit - low_value ) / ( high_value - low_value )
cardinality = 10000 * ( 501 - 1 ) / ( 10988 - 1 ) = 455.08
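The estimate can be reproduced with a quick calculation. This is a sketch of the standard linear-interpolation formula for range predicates, using the values shown above, not the optimizer's actual code:

```python
# Optimizer-style cardinality estimate for the predicate n1 < 501,
# using num_rows and low/high values from the real-time statistics.
num_rows = 10000    # rows, per STATS_ON_CONVENTIONAL_DML
low_value = 1       # low_value for column N1
high_value = 10988  # high_value for column N1
limit = 501         # predicate boundary

cardinality = num_rows * (limit - low_value) / (high_value - low_value)
print(round(cardinality, 2))  # 455.08
print(round(cardinality))     # 455 -- matches E-Rows in the plan above
```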
So if the application requires frequent truncate-and-reload cycles, ensure that either the reload is done as a direct-path load, or, if it is a conventional-path load, that statistics are regathered manually at the end of the load, to prevent suboptimal plans after the reload.
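One way to do the manual regather (a sketch; the options shown are illustrative, not the only reasonable choices) is to call dbms_stats right after the conventional-path reload:

```sql
-- After the conventional-path reload, regather statistics on T so the
-- optimizer sees the current low/high values instead of the merged ones.
begin
  dbms_stats.gather_table_stats(
    ownname       => user,
    tabname       => 'T',
    no_invalidate => false );  -- invalidate dependent cursors immediately
end;
/
```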