Thursday, January 27, 2022

Real Time statistics - Part IX

In this blog post we will examine how the real-time statistics feature behaves when table statistics are locked.
 
demo@PDB19> show parameter optimizer_real_time_statistics
 
NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
optimizer_real_time_statistics       boolean     FALSE
demo@PDB19> alter session set optimizer_real_time_statistics=true;
 
Session altered.
demo@PDB19> create table t(
  2     n1 number primary key,
  3     n2 number ,
  4     n3 varchar2(80) );
 
Table created.
 
demo@PDB19> insert /*+ append */ into t(n1,n2,n3)
  2  with rws as ( select rownum n from dual connect by level <= 1000 )
  3  select n, ceil(sqrt(n)),
  4     to_char(to_date(n,'j'),'jsp')
  5  from rws;
 
1000 rows created.
 
demo@PDB19> commit;
 
Commit complete.
 
demo@PDB19> exec dbms_stats.lock_table_stats(user,'T');
 
PL/SQL procedure successfully completed.
 
demo@PDB19> insert into t(n1,n2,n3)
  2  with rws as ( select rownum+1000 n from dual connect by level <= 1000 )
  3  select n, ceil(sqrt(n)),
  4     to_char(to_date(n,'j'),'jsp')
  5  from rws;
 
1000 rows created.
 
demo@PDB19> commit;
 
Commit complete.
 
demo@PDB19> select num_rows,blocks,monitoring
  2  from user_tables
  3  where table_name ='T';
 
  NUM_ROWS     BLOCKS MON
---------- ---------- ---
      1000          8 YES
 
demo@PDB19> select num_rows,blocks,stale_stats,notes
  2  from user_tab_statistics
  3  where table_name ='T';
 
  NUM_ROWS     BLOCKS STALE_S NOTES
---------- ---------- ------- -------------------------
      1000          8 YES
 
demo@PDB19> select column_name,num_nulls,num_distinct,low_value,get_stats_val(data_type,low_value) low_val,
  2      high_value,get_stats_val(data_type,high_value) high_val, histogram
  3  from user_tab_columns
  4  where table_name ='T';
 
COLUMN_NAM  NUM_NULLS NUM_DISTINCT LOW_VALUE  LOW_VAL    HIGH_VALUE HIGH_VAL             HISTOGRAM
---------- ---------- ------------ ---------- ---------- ---------- -------------------- ---------------
N1                  0         1000 C102       1          C20B       1000                 NONE
N2                  0           32 C102       1          C121       32                   NONE
N3                  0         1000 6569676874 eight      74776F2068 two hundred two      NONE
                                                         756E647265
                                                         642074776F
 
 
demo@PDB19> select t1.column_name,t1.num_nulls,t1.num_distinct,t1.low_value,get_stats_val(t2.data_type,t1.low_value) low_val,
  2      t1.high_value,get_stats_val(t2.data_type,t1.high_value) high_val, t1.histogram , t1.notes
  3  from user_tab_col_statistics t1,
  4       user_tab_columns t2
  5  where t1.table_name ='T'
  6  and t1.table_name = t2.table_name
  7  and t1.column_name = t2.column_name;
 
COLUMN_NAM  NUM_NULLS NUM_DISTINCT LOW_VALUE  LOW_VAL    HIGH_VALUE HIGH_VAL             HISTOGRAM       NOTES
---------- ---------- ------------ ---------- ---------- ---------- -------------------- --------------- -------------------------
N1                  0         1000 C102       1          C20B       1000                 NONE            STATS_ON_LOAD
N2                  0           32 C102       1          C121       32                   NONE            STATS_ON_LOAD
N3                  0         1000 6569676874 eight      74776F2068 two hundred two      NONE            STATS_ON_LOAD
                                                         756E647265
                                                         642074776F
 
 
demo@PDB19>
 
Locking the statistics prevents schema- and database-wide operations from gathering statistics on the table, and it also prevents the collection of real-time statistics during conventional-path DML.
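If real-time statistics are wanted again on such a table, a minimal sketch of the cleanup would be to unlock the statistics and regather them (table T is from the demo above):

```sql
-- Assumption: statistics were locked earlier with dbms_stats.lock_table_stats.
-- Unlock them so that scheduled jobs, online statistics gathering,
-- and real-time statistics can maintain the table again.
exec dbms_stats.unlock_table_stats(user, 'T');

-- Regather the now-stale statistics manually.
exec dbms_stats.gather_table_stats(user, 'T');
```

After the unlock, subsequent conventional-path DML should once more produce rows tagged STATS_ON_CONVENTIONAL_DML in USER_TAB_STATISTICS.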

Sunday, January 23, 2022

Real Time statistics - Part VIII

In this blog post we will examine how the real-time statistics feature behaves with global temporary tables (GTTs).
 
demo@PDB19> create global temporary table t(
  2     n1 number primary key,
  3     n2 number ,
  4     n3 varchar2(80) )
  5  on commit preserve rows    ;
 
Table created.
 
demo@PDB19> insert /*+ append */ into t(n1,n2,n3)
  2  with rws as ( select rownum n from dual connect by level <= 1000 )
  3  select n, ceil(sqrt(n)),
  4     to_char(to_date(n,'j'),'jsp')
  5  from rws;
 
1000 rows created.
 
demo@PDB19> commit;
 
Commit complete.
 
demo@PDB19> insert into t(n1,n2,n3)
  2  with rws as ( select rownum+1000 n from dual connect by level <= 1000 )
  3  select n, ceil(sqrt(n)),
  4     to_char(to_date(n,'j'),'jsp')
  5  from rws;
 
1000 rows created.
 
demo@PDB19> commit;
 
Commit complete.
 
demo@PDB19> exec dbms_stats.flush_database_monitoring_info;
 
PL/SQL procedure successfully completed.
 
demo@PDB19> select num_rows,blocks,monitoring
  2  from user_tables
  3  where table_name ='T';
 
  NUM_ROWS     BLOCKS MON
---------- ---------- ---
                      NO
 
demo@PDB19> select num_rows,blocks,stale_stats,notes
  2  from user_tab_statistics
  3  where table_name ='T';
 
  NUM_ROWS     BLOCKS STALE_S NOTES
---------- ---------- ------- -------------------------
 
      1000          5
 
demo@PDB19> select column_name,num_nulls,num_distinct,low_value,get_stats_val(data_type,low_value) low_val,
  2      high_value,get_stats_val(data_type,high_value) high_val, histogram
  3  from user_tab_columns
  4  where table_name ='T';
 
COLUMN_NAM  NUM_NULLS NUM_DISTINCT LOW_VALUE  LOW_VAL    HIGH_VALUE HIGH_VAL             HISTOGRAM
---------- ---------- ------------ ---------- ---------- ---------- -------------------- ---------------
N1                                                                                       NONE
N2                                                                                       NONE
N3                                                                                       NONE
 
demo@PDB19>
demo@PDB19> select t1.column_name,t1.num_nulls,t1.num_distinct,t1.low_value,get_stats_val(t2.data_type,t1.low_value) low_val,
  2      t1.high_value,get_stats_val(t2.data_type,t1.high_value) high_val, t1.histogram , t1.notes
  3  from user_tab_col_statistics t1,
  4       user_tab_columns t2
  5  where t1.table_name ='T'
  6  and t1.table_name = t2.table_name
  7  and t1.column_name = t2.column_name;
 
COLUMN_NAM  NUM_NULLS NUM_DISTINCT LOW_VALUE  LOW_VAL    HIGH_VALUE HIGH_VAL             HISTOGRAM       NOTES
---------- ---------- ------------ ---------- ---------- ---------- -------------------- --------------- -------------------------
N1                  0         1000 C102       1          C20B       1000                 NONE
N2                  0           32 C102       1          C121       32                   NONE
N3                  0         1000 6569676874 eight      74776F2068 two hundred two      NONE
                                                         756E647265
                                                         642074776F
 
 
demo@PDB19>
 
The concept of real-time statistics does not fit GTTs. Statistics on GTTs have always been problematic: statistics gathered in one session could be used when parsing SQL in other sessions. To solve this problem, Oracle 12c introduced session-level (session-private) statistics on GTTs.
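As a side note, a minimal sketch of using those session-private GTT statistics might look like this; the GLOBAL_TEMP_TABLE_STATS preference is the switch between shared and session-visible statistics (SESSION is already the default from 12c onwards, shown here explicitly):

```sql
-- Make the session-private statistics choice explicit for table T.
exec dbms_stats.set_table_prefs(user, 'T', 'GLOBAL_TEMP_TABLE_STATS', 'SESSION');

-- Gathering stats now records session-private statistics,
-- visible only to the session that gathered them.
exec dbms_stats.gather_table_stats(user, 'T');
```

With the SESSION preference in place, each session parses its SQL against statistics reflecting its own private data in the GTT.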

Monday, January 17, 2022

Real Time statistics - Part VII

In this blog post we will examine how the real-time statistics feature behaves with table truncate-and-reload operations.
 
demo@PDB19> show parameter optimizer_real_time_statistics
 
NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
optimizer_real_time_statistics       boolean     FALSE
demo@PDB19> alter session set optimizer_real_time_statistics=true;
 
Session altered.
 
demo@PDB19> create table t(
  2     n1 number primary key,
  3     n2 number ,
  4     n3 varchar2(80) );
 
Table created.
 
demo@PDB19> insert /*+ append */ into t(n1,n2,n3)
  2  with rws as ( select rownum n from dual connect by level <= 1000 )
  3  select n, ceil(sqrt(n)),
  4     to_char(to_date(n,'j'),'jsp')
  5  from rws;
 
1000 rows created.
 
demo@PDB19> commit;
 
Commit complete.
 
demo@PDB19> truncate table t;
 
Table truncated.
 
The insert statement above was a direct-path insert, which caused basic statistics to be gathered on the table. The truncate statement that followed removed the data from the table, but not its statistics.
 
demo@PDB19> select num_rows,blocks,monitoring
  2  from user_tables
  3  where table_name ='T';
 
  NUM_ROWS     BLOCKS MON
---------- ---------- ---
      1000          8 YES
 
demo@PDB19> select num_rows,blocks,stale_stats,notes
  2  from user_tab_statistics
  3  where table_name ='T';
 
  NUM_ROWS     BLOCKS STALE_S NOTES
---------- ---------- ------- -------------------------
      1000          8 YES
 
Now, when we reload the data using conventional-path DML, real-time statistics kick in and we end up with this:
 
demo@PDB19> insert into t(n1,n2,n3)
  2  with rws as ( select rownum+1000 n from dual connect by level <= 10000 )
  3  select n, ceil(sqrt(n)),
  4     to_char(to_date(n,'j'),'jsp')
  5  from rws;
 
10000 rows created.
 
demo@PDB19> commit;
 
Commit complete.
 
demo@PDB19> exec dbms_stats.flush_database_monitoring_info;
 
PL/SQL procedure successfully completed.
 
demo@PDB19> select num_rows,blocks,stale_stats,notes
  2  from user_tab_statistics
  3  where table_name ='T';
 
  NUM_ROWS     BLOCKS STALE_S NOTES
---------- ---------- ------- -------------------------
      1000          8 YES
     10000         73         STATS_ON_CONVENTIONAL_DML
 
demo@PDB19> select t1.column_name,t1.num_nulls,t1.num_distinct,t1.low_value,get_stats_val(t2.data_type,t1.low_value) low_val,
  2      t1.high_value,get_stats_val(t2.data_type,t1.high_value) high_val, t1.histogram , t1.notes
  3  from user_tab_col_statistics t1,
  4       user_tab_columns t2
  5  where t1.table_name ='T'
  6  and t1.table_name = t2.table_name
  7  and t1.column_name = t2.column_name;
 
COLUMN_NAM  NUM_NULLS NUM_DISTINCT LOW_VALUE  LOW_VAL    HIGH_VALUE HIGH_VAL             HISTOGRAM       NOTES
---------- ---------- ------------ ---------- ---------- ---------- -------------------- --------------- -------------------------
N1                  0         1000 C102       1          C20B       1000                 NONE            STATS_ON_LOAD
N2                  0           32 C102       1          C121       32                   NONE            STATS_ON_LOAD
N3                  0         1000 6569676874 eight      74776F2068 two hundred two      NONE            STATS_ON_LOAD
                                                         756E647265
                                                         642074776F
 
N1                  0              C102       1          C3020A59   10988                                STATS_ON_CONVENTIONAL_DML
N2                  0              C102       1          C20206     105                                  STATS_ON_CONVENTIONAL_DML
N3                  0              6569676874 eight      74776F2074 two thousand two hun                 STATS_ON_CONVENTIONAL_DML
                                                         686F757361 dred twenty-six
                                                         6E64207477
                                                         6F2068756E
                                                         6472656420
                                                         7477656E74
                                                         792D736978
 
 
The base statistics say we initially loaded 1000 rows, and the real-time statistics say we then loaded 10,000 rows. Now, when we parse a SQL statement whose predicate matches the range of data that was truncated (values 1 to 1000), it returns no rows.
 
demo@PDB19> select /*+ gather_plan_statistics */ count(*) from t where n1 < 501;
 
  COUNT(*)
----------
         0
 
demo@PDB19> select * from table( dbms_xplan.display_cursor(format=>'allstats last'));
 
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------
SQL_ID  agtrz8n1288u0, child number 0
-------------------------------------
select /*+ gather_plan_statistics */ count(*) from t where n1 < 501
 
Plan hash value: 1241160216
 
-------------------------------------------------------------------------------------------
| Id  | Operation         | Name        | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
-------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |             |      1 |        |      1 |00:00:00.01 |       2 |
|   1 |  SORT AGGREGATE   |             |      1 |      1 |      1 |00:00:00.01 |       2 |
|*  2 |   INDEX RANGE SCAN| SYS_C007944 |      1 |    455 |      0 |00:00:00.01 |       2 |
-------------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   2 - access("N1"<501)
 
Note
-----
   - dynamic statistics used: statistics for conventional DML
 
 
23 rows selected.
 
demo@PDB19>
 
But the column-level statistics say that column N1 holds data in the range of 1 to roughly 11,000 (the sampled high value is 10,988). So when we parsed the SQL, the optimizer took these column-level statistics into account and estimated the cardinality as
 
cardinality = number of rows * ( limit – low_value) / ( high_value – low_value )
cardinality = 10000 * ( 501 – 1 ) / ( 10988 – 1 ) = 455.08
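The arithmetic can be double-checked straight from SQL*Plus; this is just the formula above evaluated with the low/high values from the real-time column statistics:

```sql
demo@PDB19> select round(10000 * (501 - 1) / (10988 - 1), 2) est_cardinality from dual;

EST_CARDINALITY
---------------
         455.08
```

This matches the E-Rows value of 455 in the execution plan above.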
 
So if the application requires frequent truncate-and-reload cycles, ensure that either the reload is done via direct-path loads, or, for conventional-path loads, that statistics are regathered manually at the end of the load to prevent suboptimal plans after the reload.
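For the conventional-path case, a minimal sketch of such a manual regather at the end of the load could be:

```sql
-- After the conventional-path reload commits, regather statistics so the
-- optimizer sees correct num_rows and low/high column values,
-- replacing the misleading STATS_ON_CONVENTIONAL_DML figures.
exec dbms_stats.gather_table_stats(user, 'T');
```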

Friday, January 14, 2022

Real Time statistics - Part VI

In the previous blog post we discussed how lone deletes are not tracked by real-time statistics; however, there is an exception. Deletes are tracked by real-time statistics when they are preceded by an update statement.
 
demo@PDB19> show parameter optimizer_real_time_statistics
 
NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
optimizer_real_time_statistics       boolean     FALSE
demo@PDB19> alter session set optimizer_real_time_statistics=true;
 
Session altered.
 
demo@PDB19> create table t(
  2     n1 number primary key,
  3     n2 number ,
  4     n3 varchar2(80) );
 
Table created.
 
demo@PDB19> insert /*+ append */ into t(n1,n2,n3)
  2  with rws as ( select rownum n from dual connect by level <= 1000 )
  3  select n, ceil(sqrt(n)),
  4     to_char(to_date(n,'j'),'jsp')
  5  from rws;
 
1000 rows created.
 
demo@PDB19> commit;
 
Commit complete.
 
demo@PDB19> update t
  2  set n2 = ceil(sqrt(n1+1000)) ,
  3      n3 = 'z'||to_char(to_date(n1+1000,'j'),'jsp')
  4  where n1 <= 500;
 
500 rows updated.
 
demo@PDB19> commit;
 
Commit complete.
 
demo@PDB19> delete from t where n1 > 800;
 
200 rows deleted.
 
demo@PDB19> commit;
 
Commit complete.
 
The reduction in the number of rows by this delete command was tracked by real-time statistics.
 
demo@PDB19> exec dbms_stats.flush_database_monitoring_info;
 
PL/SQL procedure successfully completed.
 
demo@PDB19> select num_rows,blocks,monitoring
  2  from user_tables
  3  where table_name ='T';
 
  NUM_ROWS     BLOCKS MON
---------- ---------- ---
      1000          8 YES
 
demo@PDB19> select num_rows,blocks,stale_stats,notes
  2  from user_tab_statistics
  3  where table_name ='T';
 
  NUM_ROWS     BLOCKS STALE_S NOTES
---------- ---------- ------- -------------------------
      1000          8 YES
       800         16         STATS_ON_CONVENTIONAL_DML
 
However, looking at the column-level statistics, no real-time statistics were recorded for column N1, since it was not updated.
 
demo@PDB19> select t1.column_name,t1.num_nulls,t1.num_distinct,t1.low_value,get_stats_val(t2.data_type,t1.low_value) low_val,
  2      t1.high_value,get_stats_val(t2.data_type,t1.high_value) high_val, t1.histogram , t1.notes
  3  from user_tab_col_statistics t1,
  4       user_tab_columns t2
  5  where t1.table_name ='T'
  6  and t1.table_name = t2.table_name
  7  and t1.column_name = t2.column_name;
 
COLUMN_NAM  NUM_NULLS NUM_DISTINCT LOW_VALUE  LOW_VAL    HIGH_VALUE HIGH_VAL             HISTOGRAM  NOTES
---------- ---------- ------------ ---------- ---------- ---------- -------------------- ---------- -------------------------
N1                  0         1000 C102       1          C20B       1000                 NONE       STATS_ON_LOAD
N2                  0           32 C102       1          C121       32                   NONE       STATS_ON_LOAD
N3                  0         1000 6569676874 eight      74776F2068 two hundred two      NONE       STATS_ON_LOAD
                                                         756E647265
                                                         642074776F
 
N2                  0              C102       1          C128       39                              STATS_ON_CONVENTIONAL_DML
N3                  0              6569676874 eight      7A6F6E6520 zone thousand twenty            STATS_ON_CONVENTIONAL_DML
                                                         74686F7573 -nine
                                                         616E642074
                                                         77656E7479
                                                         2D6E696E65
 
Now, parsing a SQL statement with a predicate on column N1 (the column referenced in our delete statement, which has no real-time column-level statistics yet) produces a plan like this.
 
demo@PDB19> select /*+ gather_plan_statistics */ count(*) from t where n1 > 800;
 
  COUNT(*)
----------
         0
 
demo@PDB19> select * from table( dbms_xplan.display_cursor(format=>'allstats last'));
 
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------
SQL_ID  cfbuu2gdzj40k, child number 0
-------------------------------------
select /*+ gather_plan_statistics */ count(*) from t where n1 > 800
 
Plan hash value: 496403668
 
-------------------------------------------------------------------------------------------
| Id  | Operation         | Name        | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
-------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |             |      1 |        |      1 |00:00:00.01 |       2 |
|   1 |  SORT AGGREGATE   |             |      1 |      1 |      1 |00:00:00.01 |       2 |
|*  2 |   INDEX RANGE SCAN| SYS_C007942 |      1 |    160 |      0 |00:00:00.01 |       2 |
-------------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   2 - access("N1">800)
 
Note
-----
   - dynamic statistics used: statistics for conventional DML
 
The plan estimate of 160 rows was way off from the actual zero rows. That is because column N1 still has low/high values of 1 and 1000, and the estimated cardinality is calculated as
 
Estimated cardinality = number of rows * ( high_value – limit ) / ( high_value – low_value )
Estimated cardinality = 800 * ( 1000 – 800 ) / ( 1000 – 1 ) = 160.16
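Again, the arithmetic can be verified from SQL*Plus using the stale low/high values from the column statistics:

```sql
demo@PDB19> select round(800 * (1000 - 800) / (1000 - 1), 2) est_cardinality from dual;

EST_CARDINALITY
---------------
         160.16
```

This matches the E-Rows value of 160 in the execution plan above.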
 
So real-time table-level statistics were tracked for a delete statement preceded by an update statement, but no real-time column-level statistics were created for the column referenced only in the delete predicate. This lack of column-level statistics can potentially lead to suboptimal plans.