Having discussed enough about real time
statistics in the previous blog post, it is the time to look about mixed
workload in the context of real time statistics.
demo@PDB19>
show parameter optimizer_real_time_statistics
NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
optimizer_real_time_statistics boolean FALSE
demo@PDB19> alter session set optimizer_real_time_statistics=true;
Session altered.
demo@PDB19> create table t(
2 n1 number primary key,
3 n2 number ,
4 n3 varchar2(80) );
Table created.
demo@PDB19> insert /*+ append */ into t(n1,n2,n3)
2 with rws as ( select rownum n from dual connect by level <= 1000 )
3 select n, ceil(sqrt(n)),
4 to_char(to_date(n,'j'),'jsp')
5 from rws;
1000 rows
created.
demo@PDB19>
commit;
Commit complete.
We got an initial table created and did
some direct path load on it, to get the basic statistics available.
demo@PDB19>
select num_rows,blocks,stale_stats,notes
2 from user_tab_statistics
3 where table_name ='T';
NUM_ROWS
BLOCKS STALE_S NOTES
---------- ---------- ------- -------------------------
1000 8 NO
Did some conventional path dml, to get
the real-time statistics to kick in.
demo@PDB19>
insert into t(n1,n2,n3)
2 with rws as ( select rownum+1000 n from dual connect by level <= 10000 )
3 select n, ceil(sqrt(n)),
4 to_char(to_date(n,'j'),'jsp')
5 from rws;
10000 rows
created.
demo@PDB19>
commit;
Commit complete.
demo@PDB19>
select num_rows,blocks,monitoring
2 from user_tables
3 where table_name ='T';
NUM_ROWS
BLOCKS MON
---------- ---------- ---
1000 8 YES
demo@PDB19>
select num_rows,blocks,stale_stats,notes
2 from user_tab_statistics
3 where table_name ='T';
NUM_ROWS
BLOCKS STALE_S NOTES
---------- ---------- ------- -------------------------
1000 8 YES
demo@PDB19>
select column_name,num_nulls,num_distinct,low_value,high_value
2 from user_tab_columns
3 where table_name ='T';
COLUMN_NAM NUM_NULLS NUM_DISTINCT LOW_VALUE HIGH_VALUE
---------- ---------- ------------ ---------- ----------
N1 0 1000 C102 C20B
N2 0 32 C102 C121
N3 0 1000 6569676874 74776F2068
756E647265
642074776F
demo@PDB19> select column_name,num_nulls,num_distinct,low_value,high_value,notes
2 from user_tab_col_statistics
3 where table_name ='T';
COLUMN_NAM NUM_NULLS NUM_DISTINCT LOW_VALUE HIGH_VALUE NOTES
---------- ---------- ------------ ---------- ---------- -------------------------
N1 0 1000 C102 C20B STATS_ON_LOAD
N2 0 32 C102 C121 STATS_ON_LOAD
N3 0 1000 6569676874 74776F2068 STATS_ON_LOAD
756E647265
642074776F
N1 0 C102
C3020A02 STATS_ON_CONVENTIONAL_DML
N2 0 C102 C20206 STATS_ON_CONVENTIONAL_DML
N3 0 6569676874 74776F2074 STATS_ON_CONVENTIONAL_DML
686F757361
6E64207477
6F2068756E
6472656420
6569676874
So without flushing database monitoring
info we got only the additional column level stats, but no information about
the table level stats available yet.
demo@PDB19>
set serveroutput off
demo@PDB19> select /*+ gather_plan_statistics */ count(*) from t where n2 <= 42;
COUNT(*)
----------
1764
demo@PDB19>
select * from table(dbms_xplan.display_cursor(format=>'allstats last'));
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------
SQL_ID 99qz8xwvgmh1s, child number 0
-------------------------------------
select /*+ gather_plan_statistics */ count(*) from t where n2 <= 42
Plan hash value:
2966233522
------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | Reads |
------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 1 |00:00:00.01 | 75 | 5 |
| 1 | SORT AGGREGATE | | 1 | 1 | 1 |00:00:00.01 | 75 | 5 |
|* 2 | TABLE ACCESS STORAGE FULL| T | 1 | 425 | 1764 |00:00:00.01 | 75 | 5 |
------------------------------------------------------------------------------------------------------
Predicate
Information (identified by operation id):
---------------------------------------------------
2 -
storage("N2"<=42)
filter("N2"<=42)
Note
-----
- dynamic statistics used: statistics for conventional DML
24 rows selected.
Column stats for conventional dml, but
number of rows on table wrong. The above sql got optimized using real time statistics
available at the column level stats.
The column N2 got low value and high
value recorded as C102 and C20206, which
translates to the numerical value of 1 and 105.
demo@XEPDB1>
variable n number
demo@XEPDB1> exec dbms_stats.convert_raw_value('C102',:n);
PL/SQL procedure
successfully completed.
demo@XEPDB1>
print n
N
----------
1
demo@XEPDB1>
exec dbms_stats.convert_raw_value('C20206',:n);
PL/SQL procedure
successfully completed.
demo@XEPDB1>
print n
N
----------
105
The value of 425 reported as cardinality
in the above plan is due to mix of table and column level statistics.
Cardinality = Selectivity * Num of rows
Selectivity = required range / available range + 1 /NDV (since it got unbounded, closed)
Selectivity = (42 – 1) / (105 -1) + 1/32
= 0 .425480769
Cardinality = 0 .425480769 * 1000 = 425.48
We could wait for the monitoring
information to flush naturally, or we can flush it manually.
demo@PDB19>
exec dbms_stats.flush_database_monitoring_info;
PL/SQL procedure
successfully completed.
demo@PDB19>
select num_rows,blocks,monitoring
2 from user_tables
3 where table_name ='T';
NUM_ROWS
BLOCKS MON
---------- ---------- ---
1000 8 YES
demo@PDB19>
demo@PDB19> select num_rows,blocks,stale_stats,notes
2 from user_tab_statistics
3 where table_name ='T';
NUM_ROWS
BLOCKS STALE_S NOTES
---------- ---------- ------- -------------------------
1000 8 YES
11000 76 STATS_ON_CONVENTIONAL_DML
demo@PDB19>
select column_name,num_nulls,num_distinct,low_value,high_value
2 from user_tab_columns
3 where table_name ='T';
COLUMN_NAM NUM_NULLS NUM_DISTINCT LOW_VALUE HIGH_VALUE
---------- ---------- ------------ ---------- ----------
N1 0 1000 C102 C20B
N2 0 32 C102 C121
N3 0 1000 6569676874 74776F2068
756E647265
642074776F
demo@PDB19>
select column_name,num_nulls,num_distinct,low_value,high_value,notes
2 from user_tab_col_statistics
3 where table_name ='T';
COLUMN_NAM NUM_NULLS NUM_DISTINCT LOW_VALUE HIGH_VALUE NOTES
---------- ---------- ------------ ---------- ---------- -------------------------
N1 0 1000 C102 C20B STATS_ON_LOAD
N2 0 32 C102 C121 STATS_ON_LOAD
N3 0 1000 6569676874 74776F2068 STATS_ON_LOAD
756E647265
642074776F
N1 0 C102 C3020A02 STATS_ON_CONVENTIONAL_DML
N2 0 C102 C20206 STATS_ON_CONVENTIONAL_DML
N3 0 6569676874 74776F2074 STATS_ON_CONVENTIONAL_DML
686F757361
6E64207477
6F2068756E
6472656420
6569676874
Now the sql against this fresh set of
statistics produced new estimates.
demo@PDB19>
select /*+ gather_plan_statistics */ count(*) from t t1 where n2 <= 42;
COUNT(*)
----------
1764
demo@PDB19>
select * from table(dbms_xplan.display_cursor(format=>'allstats last'));
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------
SQL_ID f7d7dtpdauwcb, child number 0
-------------------------------------
select /*+ gather_plan_statistics */ count(*) from t t1 where n2 <= 42
Plan hash value:
2966233522
---------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers |
---------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 1 |00:00:00.01 | 75 |
| 1 | SORT AGGREGATE | | 1 | 1 | 1 |00:00:00.01 | 75 |
|* 2 | TABLE ACCESS STORAGE FULL| T | 1 | 4680 | 1764 |00:00:00.01 | 75 |
---------------------------------------------------------------------------------------------
Predicate
Information (identified by operation id):
---------------------------------------------------
2 - storage("N2"<=42)
filter("N2"<=42)
Note
-----
- dynamic statistics used: statistics for conventional DML
Creation of real-time statistics does not
invalidate cursors. Application requires to reparse the new sql’s, still there
are several opportunities for execution plan to change as the state of
statistics change.
Stale statistics – underestimate rows.
Real time column statistics – still underestimate rows
Real time table statistics – over estimate rows.
------------------------------------ ----------- ------------------------------
optimizer_real_time_statistics boolean FALSE
demo@PDB19> alter session set optimizer_real_time_statistics=true;
demo@PDB19> create table t(
2 n1 number primary key,
3 n2 number ,
4 n3 varchar2(80) );
demo@PDB19> insert /*+ append */ into t(n1,n2,n3)
2 with rws as ( select rownum n from dual connect by level <= 1000 )
3 select n, ceil(sqrt(n)),
4 to_char(to_date(n,'j'),'jsp')
5 from rws;
2 from user_tab_statistics
3 where table_name ='T';
---------- ---------- ------- -------------------------
1000 8 NO
2 with rws as ( select rownum+1000 n from dual connect by level <= 10000 )
3 select n, ceil(sqrt(n)),
4 to_char(to_date(n,'j'),'jsp')
5 from rws;
2 from user_tables
3 where table_name ='T';
---------- ---------- ---
1000 8 YES
2 from user_tab_statistics
3 where table_name ='T';
---------- ---------- ------- -------------------------
1000 8 YES
2 from user_tab_columns
3 where table_name ='T';
---------- ---------- ------------ ---------- ----------
N1 0 1000 C102 C20B
N2 0 32 C102 C121
N3 0 1000 6569676874 74776F2068
756E647265
642074776F
demo@PDB19> select column_name,num_nulls,num_distinct,low_value,high_value,notes
2 from user_tab_col_statistics
3 where table_name ='T';
---------- ---------- ------------ ---------- ---------- -------------------------
N1 0 1000 C102 C20B STATS_ON_LOAD
N2 0 32 C102 C121 STATS_ON_LOAD
N3 0 1000 6569676874 74776F2068 STATS_ON_LOAD
756E647265
642074776F
N2 0 C102 C20206 STATS_ON_CONVENTIONAL_DML
N3 0 6569676874 74776F2074 STATS_ON_CONVENTIONAL_DML
686F757361
6E64207477
6F2068756E
6472656420
6569676874
demo@PDB19> select /*+ gather_plan_statistics */ count(*) from t where n2 <= 42;
----------
1764
--------------------------------------------------------------------------------------
SQL_ID 99qz8xwvgmh1s, child number 0
-------------------------------------
select /*+ gather_plan_statistics */ count(*) from t where n2 <= 42
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | Reads |
------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 1 |00:00:00.01 | 75 | 5 |
| 1 | SORT AGGREGATE | | 1 | 1 | 1 |00:00:00.01 | 75 | 5 |
|* 2 | TABLE ACCESS STORAGE FULL| T | 1 | 425 | 1764 |00:00:00.01 | 75 | 5 |
---------------------------------------------------
filter("N2"<=42)
-----
- dynamic statistics used: statistics for conventional DML
demo@XEPDB1> exec dbms_stats.convert_raw_value('C102',:n);
----------
1
----------
105
Selectivity = required range / available range + 1 /NDV (since it got unbounded, closed)
Cardinality = 0 .425480769 * 1000 = 425.48
2 from user_tables
3 where table_name ='T';
---------- ---------- ---
1000 8 YES
demo@PDB19> select num_rows,blocks,stale_stats,notes
2 from user_tab_statistics
3 where table_name ='T';
---------- ---------- ------- -------------------------
1000 8 YES
11000 76 STATS_ON_CONVENTIONAL_DML
2 from user_tab_columns
3 where table_name ='T';
---------- ---------- ------------ ---------- ----------
N1 0 1000 C102 C20B
N2 0 32 C102 C121
N3 0 1000 6569676874 74776F2068
756E647265
642074776F
2 from user_tab_col_statistics
3 where table_name ='T';
---------- ---------- ------------ ---------- ---------- -------------------------
N1 0 1000 C102 C20B STATS_ON_LOAD
N2 0 32 C102 C121 STATS_ON_LOAD
N3 0 1000 6569676874 74776F2068 STATS_ON_LOAD
756E647265
642074776F
N2 0 C102 C20206 STATS_ON_CONVENTIONAL_DML
N3 0 6569676874 74776F2074 STATS_ON_CONVENTIONAL_DML
686F757361
6E64207477
6F2068756E
6472656420
6569676874
----------
1764
--------------------------------------------------------------------------------------
SQL_ID f7d7dtpdauwcb, child number 0
-------------------------------------
select /*+ gather_plan_statistics */ count(*) from t t1 where n2 <= 42
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers |
---------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 1 |00:00:00.01 | 75 |
| 1 | SORT AGGREGATE | | 1 | 1 | 1 |00:00:00.01 | 75 |
|* 2 | TABLE ACCESS STORAGE FULL| T | 1 | 4680 | 1764 |00:00:00.01 | 75 |
---------------------------------------------------
filter("N2"<=42)
-----
- dynamic statistics used: statistics for conventional DML
Stale statistics – underestimate rows.
Real time column statistics – still underestimate rows
Real time table statistics – over estimate rows.
No comments:
Post a Comment