Monday, May 20, 2019

SQL Tuning by adding column alias


SQL Tuning is not about adding an index to the table or adding hints or rewriting the query, it is all about understanding how oracle works and most importantly how the CBO makes use of the table and column level statistics, histograms, constraints, datatypes, system statistics and other optimizer transformation applied to build an efficient plan.

The below example started from a real project where we are moving an application database running on 11.2.0.4 in HPUX platform to a database 11.2.0.4 running on Exadata platform.

After migration, we found that a delete statement was running for more than 9hours in Exadata that got completed in few secs in non-exadata platform.

Here is a test case, a table with sample data in it.

demo@EXACC> create table t as
  2  select object_id c1,
  3     owner c2,
  4     object_type c3 ,a.*
  5  from all_objects a
  6  where rownum <= 10000;

Table created.

demo@EXACC> insert into t select * from t where rownum <=10;

10 rows created.

demo@EXACC> commit;

Commit complete.

demo@EXACC> exec dbms_stats.gather_table_stats(user,'T');

PL/SQL procedure successfully completed.

The estimated plan for the “delete” statement in Exadata platform was like this.

demo@EXACC> explain plan for
  2  delete from t where rowid in (
  3  select rowid from (
  4  select c1,c3,rowid, rank() over( partition by c1 order by c3 desc) rnk
  5  from t
  6  where c1 in (
  7  select c1 from t
  8  group by c1,c2
  9  having count(*) > 1 )
 10     )
 11  where rnk > 1 ) ;

Explained.

demo@EXACC> @xplan

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------
Plan hash value: 3590743719

------------------------------------------------------------------------------------
| Id  | Operation                          | Name     | Rows  | Bytes | Cost (%CPU)|
------------------------------------------------------------------------------------
|   0 | DELETE STATEMENT                   |          |     1 |    24 |   362K  (3)|
|   1 |  DELETE                            | T        |       |       |            |
|   2 |   NESTED LOOPS                     |          |     1 |    24 |   362K  (3)|
|   3 |    VIEW                            | VW_NSO_1 |     1 |    12 |   362K  (3)|
|   4 |     SORT UNIQUE                    |          |     1 |    25 |            |
|*  5 |      VIEW                          |          |     1 |    25 |   362K  (3)|
|   6 |       WINDOW SORT                  |          |     1 |    24 |   362K  (3)|
|*  7 |        FILTER                      |          |       |       |            |
|   8 |         TABLE ACCESS STORAGE FULL  | T        | 10010 |   234K|    71   (0)|
|*  9 |         FILTER                     |          |       |       |            |
|  10 |          HASH GROUP BY             |          |     2 |    20 |    72   (2)|
|  11 |           TABLE ACCESS STORAGE FULL| T        | 10010 |    97K|    71   (0)|
|  12 |    TABLE ACCESS BY USER ROWID      | T        |     1 |    12 |     1   (0)|
------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------

   5 - filter("RNK">1)
   7 - filter( EXISTS (SELECT 0 FROM "T" "T" GROUP BY "C1","C2" HAVING "C1"=:B1 AND
              COUNT(*)>1))
   9 - filter("C1"=:B1 AND COUNT(*)>1)

27 rows selected.

The plan shows that for each row we retrieve from table T at the step# 8 , the sub query associated at the step#7 will be executed once  for each row we retrieve from table T.

The actual plan for the “delete” statement in Exadata was like this.

demo@EXACC> set serveroutput off
demo@EXACC> delete /*+ gather_plan_statistics */ from t where rowid in (
  2  select rowid from (
  3  select c1,c3,rowid, rank() over( partition by c1 order by c3 desc) rnk
  4  from t
  5  where c1 in (
  6  select c1 from t
  7  group by c1,c2
  8  having count(*) > 1 )
  9     )
 10  where rnk > 1 ) ;

0 rows deleted.

demo@EXACC> select * from table( dbms_xplan.display_cursor(format=>'allstats last'));

PLAN_TABLE_OUTPUT
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID  39gn43tyrtdh4, child number 0
-------------------------------------
delete /*+ gather_plan_statistics */ from t where rowid in ( select
rowid from ( select c1,c3,rowid, rank() over( partition by c1 order by
c3 desc) rnk from t where c1 in ( select c1 from t group by c1,c2
having count(*) > 1 )  ) where rnk > 1 )

Plan hash value: 3590743719

---------------------------------------------------------------------------------------------------------
| Id  | Operation                          | Name     | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
---------------------------------------------------------------------------------------------------------
|   0 | DELETE STATEMENT                   |          |      1 |        |      0 |00:00:42.43 |    1740K|
|   1 |  DELETE                            | T        |      1 |        |      0 |00:00:42.43 |    1740K|
|   2 |   NESTED LOOPS                     |          |      1 |      1 |      0 |00:00:42.43 |    1740K|
|   3 |    VIEW                            | VW_NSO_1 |      1 |      1 |      0 |00:00:42.43 |    1740K|
|   4 |     SORT UNIQUE                    |          |      1 |      1 |      0 |00:00:42.43 |    1740K|
|*  5 |      VIEW                          |          |      1 |      1 |      0 |00:00:42.43 |    1740K|
|   6 |       WINDOW SORT                  |          |      1 |      1 |     20 |00:00:42.43 |    1740K|
|*  7 |        FILTER                      |          |      1 |        |     20 |00:00:00.08 |    1740K|
|   8 |         TABLE ACCESS STORAGE FULL  | T        |      1 |  10010 |  10010 |00:00:00.01 |     174 |
|*  9 |         FILTER                     |          |  10000 |        |     10 |00:00:42.41 |    1740K|
|  10 |          HASH GROUP BY             |          |  10000 |      2 |     99M|00:00:41.83 |    1740K|
|  11 |           TABLE ACCESS STORAGE FULL| T        |  10000 |  10010 |    100M|00:00:10.80 |    1740K|
|  12 |    TABLE ACCESS BY USER ROWID      | T        |      0 |      1 |      0 |00:00:00.01 |       0 |
---------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   5 - filter("RNK">1)
   7 - filter( IS NOT NULL)
   9 - filter(("C1"=:B1 AND COUNT(*)>1))


34 rows selected.

It is clear that Step# 9,10,11 was executed once for each row we retrieve from Step#8 – and those portions were the top consumers of logical IO’s and elapsed time.

Where as in non-exadata platform the plan produced by the Optimizer was like this:

demo@NON-EXACC> set serveroutput off
demo@NON-EXACC> delete /*+ gather_plan_statistics */ from t where rowid in (
  2  select rowid from (
  3  select c1,c3,rowid, rank() over( partition by c1 order by c3 desc) rnk
  4  from t
  5  where c1 in (
  6  select c1 from t
  7  group by c1,c2
  8  having count(*) > 1 )
  9     )
 10  where rnk > 1 ) ;

0 rows deleted.

demo@NON-EXACC> select * from table( dbms_xplan.display_cursor(format=>'allstats last'));

PLAN_TABLE_OUTPUT
-------------------------------------------------------------------------------------------------------------------------------
SQL_ID  39gn43tyrtdh4, child number 0
-------------------------------------
delete /*+ gather_plan_statistics */ from t where rowid in ( select
rowid from ( select c1,c3,rowid, rank() over( partition by c1 order by
c3 desc) rnk from t where c1 in ( select c1 from t group by c1,c2
having count(*) > 1 )  ) where rnk > 1 )

Plan hash value: 1715516498

---------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name     | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
---------------------------------------------------------------------------------------------------
|   0 | DELETE STATEMENT             |          |      1 |        |      0 |00:00:00.01 |     346 |
|   1 |  DELETE                      | T        |      1 |        |      0 |00:00:00.01 |     346 |
|   2 |   NESTED LOOPS               |          |      1 |      1 |      0 |00:00:00.01 |     346 |
|   3 |    VIEW                      | VW_NSO_2 |      1 |    502 |      0 |00:00:00.01 |     346 |
|   4 |     SORT UNIQUE              |          |      1 |      1 |      0 |00:00:00.01 |     346 |
|*  5 |      VIEW                    |          |      1 |    502 |      0 |00:00:00.01 |     346 |
|   6 |       WINDOW SORT            |          |      1 |    502 |     20 |00:00:00.01 |     346 |
|*  7 |        HASH JOIN RIGHT SEMI  |          |      1 |    502 |     20 |00:00:00.01 |     346 |
|   8 |         VIEW                 | VW_NSO_1 |      1 |    501 |     10 |00:00:00.01 |     173 |
|*  9 |          FILTER              |          |      1 |        |     10 |00:00:00.01 |     173 |
|  10 |           SORT GROUP BY      |          |      1 |     26 |  10000 |00:00:00.01 |     173 |
|  11 |            TABLE ACCESS FULL | T        |      1 |  10010 |  10010 |00:00:00.01 |     173 |
|  12 |         TABLE ACCESS FULL    | T        |      1 |  10010 |  10010 |00:00:00.01 |     173 |
|  13 |    TABLE ACCESS BY USER ROWID| T        |      0 |      1 |      0 |00:00:00.01 |       0 |
---------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   5 - filter("RNK">1)
   7 - access("C1"="C1")
   9 - filter(COUNT(*)>1)


35 rows selected.


The subquery was transformed as join and executed only once in overall execution.

Upon analyzing why the subquery does not transformed into join in Exadata platform, we took 10053 trace from both the environment for comparison.

10053 trace from non-exadata platform.

*****************************
Cost-Based Subquery Unnesting
*****************************
SU: Unnesting query blocks in query block DEL$1 (#1) that are valid to unnest.
Subquery removal for query block SEL$1 (#2)
RSW: Not valid for subquery removal SEL$1 (#2)
Subquery unchanged.
Subquery removal for query block SEL$3 (#4)
RSW: Not valid for subquery removal SEL$3 (#4)
Subquery unchanged.
Subquery Unnesting on query block SEL$2 (#3)SU: Performing unnesting that does not require costing.
SU: Considering subquery unnest on query block SEL$2 (#3).
SU:   Checking validity of unnesting subquery SEL$3 (#4)
SU:   Passed validity checks, but requires costing.
SU: Using search type: exhaustive
SU: Starting iteration 1, state space = (4) : (1)
SU:   Transform an ANY subquery to semi-join or distinct.

10053 trace from exadata platform.

*****************************
Cost-Based Subquery Unnesting
*****************************
SU: Unnesting query blocks in query block DEL$1 (#1) that are valid to unnest.
Subquery removal for query block SEL$1 (#2)
RSW: Not valid for subquery removal SEL$1 (#2)
Subquery unchanged.
Subquery removal for query block SEL$3 (#4)
RSW: Not valid for subquery removal SEL$3 (#4)
Subquery unchanged.
Subquery Unnesting on query block SEL$2 (#3)SU: Performing unnesting that does not require costing.
SU: Considering subquery unnest on query block SEL$2 (#3).
SU:   Checking validity of unnesting subquery SEL$3 (#4)
SU:     SU bypassed: Subquery in a view with rowid reference.
SU:   Validity checks failed.
Subquery Unnesting on query block DEL$1 (#1)SU: Performing unnesting that does not require costing.
SU: Considering subquery unnest on query block DEL$1 (#1).
SU:   Checking validity of unnesting subquery SEL$1 (#2)
SU:   Passed validity checks, but requires costing.
SU: Using search type: exhaustive
SU: Starting iteration 1, state space = (2) : (1)
SU:   Transform an ANY subquery to semi-join or distinct.

The above highlighted section clearly shows that – the “rowid’ referenced in the query block Q3, doesn’t point to “rowid’ referenced in query block Q2 and hence the subquery doesn’t transform into a join in Exadata platform.

delete from t where rowid in (
select /*+ qb_name(q3) */ rowid from (
select /*+ qb_name(q2) */ c1,c3,rowid, rank() over( partition by c1 order by c3 desc) rnk
from t
where c1 in (
select /*+ qb_name(q1) */ c1 from t
group by c1,c2
having count(*) > 1 )
       )
where rnk > 1 ) ;

However if we do alias the column “rowid” in the query block Q2, then plan changes – sub query transformed into a join- and completed within few seconds.

demo@EXACC> set serveroutput off
demo@EXACC> delete /*+ gather_plan_statistics */ from t where rowid in (
  2  select /*+ qb_name(q3) */ rid from (
  3  select /*+ qb_name(q2) */ c1,c3,rowid rid, rank() over( partition by c1 order by c3 desc) rnk
  4  from t
  5  where c1 in (
  6  select /*+ qb_name(q1) */ c1 from t
  7  group by c1,c2
  8  having count(*) > 1 )
  9     )
 10  where rnk > 1 ) ;

0 rows deleted.

demo@EXACC> select * from table( dbms_xplan.display_cursor(format=>'allstats last +alias'));

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------------------------------------------------
SQL_ID  3kqqfx7cvx958, child number 0
-------------------------------------
delete /*+ gather_plan_statistics */ from t where rowid in ( select /*+
qb_name(q3) */ rid from ( select /*+ qb_name(q2) */ c1,c3,rowid rid,
rank() over( partition by c1 order by c3 desc) rnk from t where c1 in (
select /*+ qb_name(q1) */ c1 from t group by c1,c2 having count(*) > 1
)  ) where rnk > 1 )

Plan hash value: 1715516498

----------------------------------------------------------------------------------------------------------
| Id  | Operation                           | Name     | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
----------------------------------------------------------------------------------------------------------
|   0 | DELETE STATEMENT                    |          |      1 |        |      0 |00:00:00.01 |     348 |
|   1 |  DELETE                             | T        |      1 |        |      0 |00:00:00.01 |     348 |
|   2 |   NESTED LOOPS                      |          |      1 |      1 |      0 |00:00:00.01 |     348 |
|   3 |    VIEW                             | VW_NSO_2 |      1 |    502 |      0 |00:00:00.01 |     348 |
|   4 |     SORT UNIQUE                     |          |      1 |      1 |      0 |00:00:00.01 |     348 |
|*  5 |      VIEW                           |          |      1 |    502 |      0 |00:00:00.01 |     348 |
|   6 |       WINDOW SORT                   |          |      1 |    502 |     20 |00:00:00.01 |     348 |
|*  7 |        HASH JOIN RIGHT SEMI         |          |      1 |    502 |     20 |00:00:00.01 |     348 |
|   8 |         VIEW                        | VW_NSO_1 |      1 |    501 |     10 |00:00:00.01 |     174 |
|*  9 |          FILTER                     |          |      1 |        |     10 |00:00:00.01 |     174 |
|  10 |           SORT GROUP BY             |          |      1 |     26 |  10000 |00:00:00.01 |     174 |
|  11 |            TABLE ACCESS STORAGE FULL| T        |      1 |  10010 |  10010 |00:00:00.01 |     174 |
|  12 |         TABLE ACCESS STORAGE FULL   | T        |      1 |  10010 |  10010 |00:00:00.01 |     174 |
|  13 |    TABLE ACCESS BY USER ROWID       | T        |      0 |      1 |      0 |00:00:00.01 |       0 |
----------------------------------------------------------------------------------------------------------


Query Block Name / Object Alias (identified by operation id):
-------------------------------------------------------------

   1 - SEL$BB91A27F
   3 - SEL$186AFB95 / VW_NSO_2@SEL$BB91A27F
   4 - SEL$186AFB95
   5 - SEL$61D88735 / from$_subquery$_002@Q3
   6 - SEL$61D88735
   8 - SEL$CAF36C18 / VW_NSO_1@SEL$61D88735
   9 - SEL$CAF36C18
  11 - SEL$CAF36C18 / T@Q1
  12 - SEL$61D88735 / T@Q2
  13 - SEL$BB91A27F / T@DEL$1

Predicate Information (identified by operation id):
---------------------------------------------------

   5 - filter("RNK">1)
   7 - access("C1"="C1")
   9 - filter(COUNT(*)>1)


50 rows selected.

demo@EXACC>

So the lesson learnt is when having the Pseudo columns in the Select statement have them properly aliased & never leave them open as such (specifically if those Pseudo columns are referenced in the outer most query block either in the filters/projections).


Monday, May 13, 2019

Table Elimination - Part V


Having discussed in the past (part1, part2, part3 and part4) about the Join elimination – ability of the optimizer to remove joins that are not required in the query. It is also important to know what precondition must be satisfied for join elimination.

To enable join elimination, the relationship between the tables must be implemented with foreign key constraints. If the constraints are missing, most of the query transformation are not possible at all. Therefore, it is highly recommended to use constraints.

In data warehouses, a common approach is to define the foreign key constraints with DISABLE NOVALIDATE state. In this case, the constraints are visible in the database and can be used for documentation and reverse engineering of the data model, but also for query transformation like join elimination and query rewrites. But the data loaded by the ETL process are not checked/validated against the constrains. Therefore, data consistency has to be guaranteed by the ETL process.

When a constraints is not validated, it should be defined as a reliable in the data warehouse. This is required if query rewrite on materialized view is used. But it is a precondition for join elimination too.

demo@ORA11G> create table emp as
  2  select *
  3  from scott.emp ;

Table created.

demo@ORA11G> create table dept as
  2  select *
  3  from scott.dept ;

Table created.

demo@ORA11G> alter table dept
  2  add constraint dept_pk
  3  primary key(deptno) rely novalidate;

Table altered.

demo@ORA11G> alter table emp
  2  add constraint emp_fk
  3  foreign key(deptno) references dept rely novalidate
  4  modify deptno not null;

Table altered.

demo@ORA11G> select constraint_name,table_name,constraint_type,status,validated,rely
  2  from user_constraints
  3  where table_name in ('EMP','DEPT')
  4  and constraint_type in ('P','R') ;

CONSTRAINT_NAME TABLE_NAME      C STATUS   VALIDATED     RELY
--------------- --------------- - -------- ------------- ----
DEPT_PK         DEPT            P ENABLED  NOT VALIDATED RELY
EMP_FK          EMP             R ENABLED  NOT VALIDATED RELY

demo@ORA11G> show parameter optimizer_features_enable

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
optimizer_features_enable            string      11.2.0.4
demo@ORA11G> set autotrace traceonly explain
demo@ORA11G> select ename
  2  from emp e, dept d
  3  where e.deptno = d.deptno;

Execution Plan
----------------------------------------------------------
Plan hash value: 3956160932

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |    14 |    98 |     3   (0)| 00:00:01 |
|   1 |  TABLE ACCESS FULL| EMP  |    14 |    98 |     3   (0)| 00:00:01 |
--------------------------------------------------------------------------

Note
-----
   - dynamic sampling used for this statement (level=2)

demo@ORA11G>

With NOVALIDATE and RELY constraints table DEPT was eliminated in the above sql. However, in Oracle 12c the plan looks slightly different.

demo@PDB1> show parameter optimizer_features_enable

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
optimizer_features_enable            string      12.2.0.1
demo@PDB1> set autotrace traceonly explain
demo@PDB1> select ename
  2  from emp e, dept d
  3  where e.deptno = d.deptno;

Execution Plan
----------------------------------------------------------
Plan hash value: 4269077325

------------------------------------------------------------------------------
| Id  | Operation          | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |         |    14 |   168 |     3   (0)| 00:00:01 |
|   1 |  NESTED LOOPS      |         |    14 |   168 |     3   (0)| 00:00:01 |
|   2 |   TABLE ACCESS FULL| EMP     |    14 |   126 |     3   (0)| 00:00:01 |
|*  3 |   INDEX UNIQUE SCAN| DEPT_PK |     1 |     3 |     0   (0)| 00:00:01 |
------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - access("E"."DEPTNO"="D"."DEPTNO")


In Oracle 12c, join elimination with NOVALIDATE will work only, if QUERY_REWRITE_INTEGRITY parameter set to TRUSTED. This is mentioned in the documentation.

If a foreign key constraint is in NOVALIDATE state, join elimination is not done when QUERY_REWRITE_INTEGRITY=enforced. This means that queries with joins over a foreign key constraint that is in RELY NOVALIDATE state can potentially take longer to parse and execute as the optimizer does not trust the RELY.

demo@PDB1> show parameter query_rewrite_integrity

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
query_rewrite_integrity              string      enforced
demo@PDB1> alter session set query_rewrite_integrity=Trusted;

Session altered.

demo@PDB1> set autotrace traceonly explain
demo@PDB1> select ename
  2  from emp e, dept d
  3  where e.deptno = d.deptno;

Execution Plan
----------------------------------------------------------
Plan hash value: 3956160932

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |    14 |    84 |     3   (0)| 00:00:01 |
|   1 |  TABLE ACCESS FULL| EMP  |    14 |    84 |     3   (0)| 00:00:01 |
--------------------------------------------------------------------------

Note
-----
   - rely constraint used for this statement

demo@PDB1>

In Summary foreign key constraints in data warehouses are often defined with NOVALIDATE. To enable query transformation such as join elimination and query rewrite, the constraints have to be set to reliable with RELY keyword. Additionally, you have to take care that the parameter QUERY_REWRITE_INTEGRITY is set to TRUSTED on an Oracle 12c environment, otherwise join elimination will not work anymore.