THE RAJESH BLOG: Scalar Subquery un-nesting Transformation

A scalar subquery is a subquery which can only return a single value (single row, single column) back to the parent block. It can be used both in the WHERE clause of the parent query or right in the SELECT list of the parent query instead of some column name. (In this post I am discussing only the case of a scalar subquery in SELECT clause). Whatever the scalar subquery returns, will be put in the “fake” column in the query result-set. If it returns more than one row, you will get the ORA-01427: single-row subquery returns more than one row error, if it returns no rows for the given lookup, the result will be NULL.

rajesh@PDB1> create table t1

2 as

3 select *

4 from all_users;

Table created.

rajesh@PDB1>

rajesh@PDB1> create table t2

2 as

3 select b.*

4 from all_objects b ,

5 all_users a

6 where b.owner = a.username;

Table created.

rajesh@PDB1> set serveroutput off

So starting with Oracle 12c, the CBO transformation engine can unnest some types of scalar subqueries and convert those to Outer joins internally

rajesh@PDB1> select /*+ gather_plan_statistics no_unnest(@ssq) */ t1.user_id,

2 ( select /*+ qb_name(ssq) */ max(created)

3 from t2 where t1.username = t2.owner) max_created

4 from t1

5 where username like 'S%';

USER_ID MAX_CREATED

---------- -----------------------

105 28-NOV-2014 02:28:46 PM

78 11-SEP-2014 09:58:01 AM

2147483619

2147483618

2147483617

8 11-SEP-2014 09:13:44 AM

0 06-JAN-2015 09:43:48 AM

9 rows selected.

rajesh@PDB1> select * from table( dbms_xplan.display_cursor(format=>'allstats last') );

PLAN_TABLE_OUTPUT

-------------------------------------------------------------------------------------------------------------------

SQL_ID 3892tkr4xayk5, child number 0

-------------------------------------

select /*+ gather_plan_statistics no_unnest(@ssq) */ t1.user_id, (

select /*+ qb_name(ssq) */ max(created) from t2 where t1.username =

t2.owner) max_created from t1 where username like 'S%'

Plan hash value: 367820

---------------------------------------------------------------------------------

---------------------------------------------------------------------------------

| 0 | SELECT STATEMENT | | 1 | | 9 | 4 | 5 |

| 1 | SORT AGGREGATE | | 9 | 1 | 9 | 7857 | 868 |

|* 2 | TABLE ACCESS FULL| T2 | 9 | 2090 | 42155 | 7857 | 868 |

|* 3 | TABLE ACCESS FULL | T1 | 1 | 3 | 9 | 4 | 5 |

---------------------------------------------------------------------------------

Predicate Information (identified by operation id):

---------------------------------------------------

2 - filter("T2"."OWNER"=:B1)

3 - filter("USERNAME" LIKE 'S%')

The above plan indicates that there is some scalar subquery is going on in the SQL statement, from the “Starts” column we can see that outer table T1 is scanned only once but the table T2 is scanned 9 times (once for each row returned by outer table T1), so the scalar subquery behaves like a FILTER loop. Where it executes the subquery once for each row passed to it by the parent query.

So now let us remove this no_unnest(@ssq)hint and hope for 12c improvement to kick in

rajesh@PDB1> select /*+ gather_plan_statistics */ t1.user_id,

2 ( select max(created)

3 from t2 where t1.username = t2.owner) max_created

4 from t1

5 where username like 'S%';

USER_ID MAX_CREATED

---------- -----------------------

0 06-JAN-2015 09:43:48 AM

2147483618

8 11-SEP-2014 09:13:44 AM

2147483619

105 28-NOV-2014 02:28:46 PM

2147483617

78 11-SEP-2014 09:58:01 AM

9 rows selected.

rajesh@PDB1> select * from table( dbms_xplan.display_cursor(format=>'allstats last') );

PLAN_TABLE_OUTPUT

-------------------------------------------------------------------------------------------------------------------

SQL_ID 7p5b151cb0tg8, child number 0

-------------------------------------

select /*+ gather_plan_statistics */ t1.user_id, ( select

max(created) from t2 where t1.username = t2.owner)

max_created from t1 where username like 'S%'

Plan hash value: 577572187

----------------------------------------------------------------------------------------------------

----------------------------------------------------------------------------------------------------

| 0 | SELECT STATEMENT | | 1 | | 9 | 876 | | | |

| 1 | HASH GROUP BY | | 1 | 5 | 9 | 876 | 1034K| 1034K| 1232K (0)|

|* 2 | HASH JOIN OUTER | | 1 | 471 | 42160 | 876 | 1214K| 1214K| 987K (0)|

|* 3 | TABLE ACCESS FULL| T1 | 1 | 3 | 9 | 3 | | | |

|* 4 | TABLE ACCESS FULL| T2 | 1 | 4367 | 42155 | 873 | | | |

----------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):

---------------------------------------------------

2 - access("T1"."USERNAME"="T2"."OWNER")

3 - filter("USERNAME" LIKE 'S%')

4 - filter("T2"."OWNER" LIKE 'S%')

Both tables were scanned only once and the total number of buffer gets dropped from 7857 to 876. In the real life case, flattening and de-compartmentalizing a big query will open up more join orders and data access optimizations for the CBO, resulting in a better plans.

Whenever the scalar subquery transformation is done, the following entry shows up in the optimizer 10053 trace.

Final query after transformations:******* UNPARSED QUERY IS *******

SELECT "T1"."USER_ID" "USER_ID",

MAX("T2"."CREATED") "MAX_CREATED"

FROM "RAJESH"."T2" "T2",

"RAJESH"."T1" "T1"

WHERE "T1"."USERNAME" LIKE 'S%'

AND "T1"."USERNAME"="T2"."OWNER"(+)

AND "T2"."OWNER"(+) LIKE 'S%'

GROUP BY "T2"."OWNER",

"T1".ROWID,

"T1"."USER_ID"

So over all the auto trace results for this transformation are below.

rajesh@PDB1> set autotrace traceonly explain statistics

rajesh@PDB1> select /*+ no_unnest(@ssq) */ t1.user_id,

2 ( select /*+ qb_name(ssq) */ max(created)

3 from t2 where t1.username = t2.owner) max_created

4 from t1

5 where username like 'S%';

9 rows selected.

Execution Plan

----------------------------------------------------------

Plan hash value: 367820

---------------------------------------------------------------------------

---------------------------------------------------------------------------

| 0 | SELECT STATEMENT | | 3 | 42 | 331 (1)| 00:00:01 |

| 1 | SORT AGGREGATE | | 1 | 13 | | |

|* 2 | TABLE ACCESS FULL| T2 | 2090 | 27170 | 243 (0)| 00:00:01 |

|* 3 | TABLE ACCESS FULL | T1 | 3 | 42 | 3 (0)| 00:00:01 |

---------------------------------------------------------------------------

Predicate Information (identified by operation id):

---------------------------------------------------

2 - filter("T2"."OWNER"=:B1)

3 - filter("USERNAME" LIKE 'S%')

Statistics

----------------------------------------------------------

1 recursive calls

0 db block gets

7861 consistent gets

0 physical reads

0 redo size

770 bytes sent via SQL*Net to client

552 bytes received via SQL*Net from client

2 SQL*Net roundtrips to/from client

0 sorts (memory)

0 sorts (disk)

9 rows processed

rajesh@PDB1> select t1.user_id,

2 ( select max(created)

3 from t2 where t1.username = t2.owner) max_created

4 from t1

5 where username like 'S%';

9 rows selected.

Execution Plan

----------------------------------------------------------

Plan hash value: 577572187

----------------------------------------------------------------------------

----------------------------------------------------------------------------

| 0 | SELECT STATEMENT | | 5 | 195 | 248 (1)| 00:00:01 |

| 1 | HASH GROUP BY | | 5 | 195 | 248 (1)| 00:00:01 |

|* 2 | HASH JOIN OUTER | | 471 | 18369 | 247 (1)| 00:00:01 |

|* 3 | TABLE ACCESS FULL| T1 | 3 | 78 | 3 (0)| 00:00:01 |

|* 4 | TABLE ACCESS FULL| T2 | 4367 | 56771 | 244 (1)| 00:00:01 |

----------------------------------------------------------------------------

Predicate Information (identified by operation id):

---------------------------------------------------

2 - access("T1"."USERNAME"="T2"."OWNER"(+))

3 - filter("USERNAME" LIKE 'S%')

4 - filter("T2"."OWNER"(+) LIKE 'S%')

Statistics

----------------------------------------------------------

1 recursive calls

0 db block gets

876 consistent gets

0 physical reads

0 redo size

771 bytes sent via SQL*Net to client

552 bytes received via SQL*Net from client

2 SQL*Net roundtrips to/from client

0 sorts (memory)

0 sorts (disk)

9 rows processed

rajesh@PDB1> set autotrace off

rajesh@PDB1>

Note that when I use COUNT(*) instead of MAX(created) in the above query, I don’t see Outer join + group by combo. With the following message in trace file.

Subquery Unnest (SU)

********************

SU: bypassed: Scalar subquery has null-mutating select item.

SJC: Considering set-join conversion in query block SEL$1 (#0)

*************************

This transformation in 12c database is controlled by parameter “_optimizer_unnest_scalar_sq”

3 comments:

RAJESHFebruary 12, 2016 at 6:00 AM
Note to self.
In case of count (*) we don’t see Scalar sub-queries un-nesting happen.

rajesh@ORA12C> select t1.user_id,
2 ( select count(*)
3 from t2
4 where t1.username = t2.owner) max_created
5 from t1
6 where username like 'S%'
7 /

Execution Plan
----------------------------------------------------------
Plan hash value: 367820

---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 3 | 39 | 356 (0)| 00:00:01 |
| 1 | SORT AGGREGATE | | 1 | 5 | | |
|* 2 | TABLE ACCESS FULL| T2 | 1707 | 8535 | 246 (0)| 00:00:01 |
|* 3 | TABLE ACCESS FULL | T1 | 3 | 39 | 3 (0)| 00:00:01 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

2 - filter("T2"."OWNER"=:B1)
3 - filter("USERNAME" LIKE 'S%')

Optimizer cannot transform the given query

select t1.user_id,
( select count(*)
from t2
where t1.username = t2.owner) max_created
from t1
where username like 'S%';

Into like this

SELECT "T1"."USER_ID" "USER_ID",
count(*) "MAX_CREATED"
FROM "RAJESH"."T2" "T2",
"RAJESH"."T1" "T1"
WHERE "T1"."USERNAME" LIKE 'S%'
AND "T1"."USERNAME"="T2"."OWNER"(+)
AND "T2"."OWNER"(+) LIKE 'S%'
GROUP BY "T2"."OWNER",
"T1".ROWID,
"T1"."USER_ID"

a) Do an outer join between T1 and T2
b) Perform count(*) on top of the result set. Which is very different from just counting the rows in T2.

We cannot take an optimization performed for an aggregate function MAX() and apply the same to COUNT(*) (since both max() and count(*) belong to aggregates, they are different in functionality )

So in short the rule to perform scalar subquery unnesting for COUNT (*) queries should be like this.

a) Check if T2 has any not null column
1) if yes then kick-in scalar subquery unnesting (perform outer join between T1 and T2)
2) replace count(*) in the query with count(that_not_nullable_column)

b) if T2 dont have any not null column then
1) dont perform scalar subquery unnesting, (since doing that would produce incorrect results). So have it as such.

But as of 12.1.0.2 the optimizer don't have the above steps coded for COUNT(*) aggregates in scalar subquery un-nesting.
AnonymousJuly 28, 2017 at 4:21 AM
Is it good to set the value of _optimizer_unnest_scalar_sq to FALSE or leave the default value TRUE

Tuesday, January 6, 2015

Scalar Subquery un-nesting Transformation

3 comments: