A scalar subquery is a subquery which can only
return a single value (single row, single column) back to the parent block. It
can be used both in the WHERE clause of the parent query or right in the SELECT
list of the parent query instead of some column name. (In this post I am
discussing only the case of a scalar subquery in SELECT clause). Whatever the
scalar subquery returns, will be put in the “fake” column in the query
result-set. If it returns more than one row, you will get the ORA-01427:
single-row subquery returns more than one row error, if it returns no rows for
the given lookup, the result will be NULL.
So
starting with Oracle 12c, the CBO transformation engine can unnest some types
of scalar subqueries and convert those to Outer joins internally
rajesh@PDB1> create table t1
2 as
3 select *
4 from all_users;
Table created.
rajesh@PDB1>
rajesh@PDB1> create table t2
2 as
3 select b.*
4 from all_objects b ,
5 all_users a
6 where b.owner = a.username;
Table created.
rajesh@PDB1> set serveroutput off
rajesh@PDB1> select /*+ gather_plan_statistics
no_unnest(@ssq) */ t1.user_id,
2 ( select /*+ qb_name(ssq) */ max(created)
3 from t2 where t1.username =
t2.owner) max_created
4 from t1
5 where username like 'S%';
USER_ID MAX_CREATED
---------- -----------------------
105 28-NOV-2014
02:28:46 PM
90
87
78 11-SEP-2014
09:58:01 AM
2147483619
2147483618
2147483617
8 11-SEP-2014 09:13:44
AM
0 06-JAN-2015
09:43:48 AM
9 rows selected.
rajesh@PDB1> select * from table(
dbms_xplan.display_cursor(format=>'allstats last') );
PLAN_TABLE_OUTPUT
-------------------------------------------------------------------------------------------------------------------
SQL_ID 3892tkr4xayk5,
child number 0
-------------------------------------
select /*+ gather_plan_statistics no_unnest(@ssq) */
t1.user_id, (
select /*+ qb_name(ssq) */ max(created) from t2 where t1.username =
t2.owner) max_created from t1 where username like 'S%'
Plan hash value: 367820
---------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows |
Buffers | Reads |
---------------------------------------------------------------------------------
| 0 | SELECT
STATEMENT | |
1 | | 9 |
4 | 5 |
| 1 | SORT AGGREGATE |
| 9 | 1 |
9 | 7857 | 868 |
|* 2 | TABLE ACCESS FULL| T2 |
9 | 2090 | 42155 |
7857 | 868 |
|* 3 | TABLE ACCESS FULL | T1 |
1 | 3 | 9 |
4 | 5 |
---------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 -
filter("T2"."OWNER"=:B1)
3 -
filter("USERNAME" LIKE 'S%')
The above plan indicates that there is
some scalar subquery is going on in the SQL statement, from the “Starts” column we can see that outer
table T1 is scanned only once but the table T2 is scanned 9 times (once for
each row returned by outer table T1), so the scalar subquery behaves like a FILTER loop. Where it executes the
subquery once for each row passed to it by the parent query.
So now let us remove this no_unnest(@ssq)hint and hope for 12c improvement to kick in
rajesh@PDB1> select /*+ gather_plan_statistics */ t1.user_id,
2 ( select max(created)
3 from t2 where t1.username =
t2.owner) max_created
4 from t1
5 where username like 'S%';
USER_ID MAX_CREATED
---------- -----------------------
0 06-JAN-2015
09:43:48 AM
2147483618
8 11-SEP-2014
09:13:44 AM
2147483619
90
87
105 28-NOV-2014
02:28:46 PM
2147483617
78 11-SEP-2014
09:58:01 AM
9 rows selected.
rajesh@PDB1> select * from table(
dbms_xplan.display_cursor(format=>'allstats last') );
PLAN_TABLE_OUTPUT
-------------------------------------------------------------------------------------------------------------------
SQL_ID 7p5b151cb0tg8,
child number 0
-------------------------------------
select /*+ gather_plan_statistics */ t1.user_id, ( select
max(created)
from t2 where t1.username = t2.owner)
max_created from t1 where username like 'S%'
Plan hash value: 577572187
----------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows |
Buffers | OMem | 1Mem | Used-Mem |
----------------------------------------------------------------------------------------------------
| 0 | SELECT
STATEMENT | |
1 | | 9 |
876 | | |
|
| 1 | HASH GROUP BY | |
1 | 5 | 9 |
876 | 1034K| 1034K| 1232K (0)|
|* 2 | HASH JOIN OUTER |
| 1 | 471 |
42160 | 876 | 1214K|
1214K| 987K (0)|
|* 3 | TABLE ACCESS FULL| T1 |
1 | 3 | 9 |
3 | | | |
|* 4 | TABLE ACCESS FULL| T2 |
1 | 4367 | 42155 |
873 | | | |
----------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 -
access("T1"."USERNAME"="T2"."OWNER")
3 -
filter("USERNAME" LIKE 'S%')
4 -
filter("T2"."OWNER" LIKE 'S%')
Both tables were scanned only once and
the total number of buffer gets dropped from 7857 to 876. In the real life
case, flattening and de-compartmentalizing a big query will open up more join
orders and data access optimizations for the CBO, resulting in a better plans.
Whenever the scalar subquery
transformation is done, the following entry shows up in the optimizer 10053
trace.
Final query after transformations:******* UNPARSED QUERY IS
*******
SELECT "T1"."USER_ID" "USER_ID",
MAX("T2"."CREATED") "MAX_CREATED"
FROM "RAJESH"."T2"
"T2",
"RAJESH"."T1"
"T1"
WHERE "T1"."USERNAME" LIKE 'S%'
AND "T1"."USERNAME"="T2"."OWNER"(+)
AND "T2"."OWNER"(+)
LIKE 'S%'
GROUP BY "T2"."OWNER",
"T1".ROWID,
"T1"."USER_ID"
So over all the auto trace results for
this transformation are below.
rajesh@PDB1> set autotrace traceonly explain statistics
rajesh@PDB1> select /*+ no_unnest(@ssq) */ t1.user_id,
2 ( select /*+ qb_name(ssq) */ max(created)
3 from t2 where t1.username =
t2.owner) max_created
4 from t1
5 where username like 'S%';
9 rows selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 367820
---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | SELECT
STATEMENT | |
3 | 42 | 331
(1)| 00:00:01 |
| 1 | SORT AGGREGATE |
| 1 | 13 | | |
|* 2 | TABLE ACCESS FULL| T2 |
2090 | 27170 | 243 (0)| 00:00:01 |
|* 3 | TABLE ACCESS FULL | T1 |
3 | 42 | 3
(0)| 00:00:01 |
---------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 -
filter("T2"."OWNER"=:B1)
3 -
filter("USERNAME" LIKE 'S%')
Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
7861 consistent gets
0
physical reads
0 redo size
770 bytes sent via SQL*Net to client
552 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
9 rows processed
rajesh@PDB1> select t1.user_id,
2 ( select max(created)
3 from t2 where t1.username =
t2.owner) max_created
4 from t1
5 where username like 'S%';
9 rows selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 577572187
----------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------
| 0 | SELECT
STATEMENT | |
5 | 195 | 248
(1)| 00:00:01 |
| 1 | HASH GROUP BY |
| 5 | 195 |
248 (1)| 00:00:01 |
|* 2 | HASH JOIN OUTER |
| 471 |
18369 | 247 (1)| 00:00:01 |
|* 3 | TABLE ACCESS FULL| T1 |
3 | 78 | 3
(0)| 00:00:01 |
|* 4 | TABLE ACCESS FULL| T2 |
4367 | 56771 | 244 (1)| 00:00:01 |
----------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 -
access("T1"."USERNAME"="T2"."OWNER"(+))
3 -
filter("USERNAME" LIKE 'S%')
4 -
filter("T2"."OWNER"(+) LIKE 'S%')
Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
876 consistent gets
0 physical reads
0 redo size
771
bytes sent via SQL*Net to client
552 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
9 rows processed
rajesh@PDB1> set autotrace off
rajesh@PDB1>
Note that when I use COUNT(*) instead of
MAX(created) in the above query, I don’t see Outer join + group by combo. With the
following message in trace file.
Subquery Unnest (SU)
********************
SU: bypassed: Scalar subquery has null-mutating select
item.
SJC: Considering set-join conversion in query block SEL$1 (#0)
*************************
This transformation in 12c database is
controlled by parameter “_optimizer_unnest_scalar_sq”
Note to self.
ReplyDeleteIn case of count (*) we don’t see Scalar sub-queries un-nesting happen.
rajesh@ORA12C> select t1.user_id,
2 ( select count(*)
3 from t2
4 where t1.username = t2.owner) max_created
5 from t1
6 where username like 'S%'
7 /
Execution Plan
----------------------------------------------------------
Plan hash value: 367820
---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 3 | 39 | 356 (0)| 00:00:01 |
| 1 | SORT AGGREGATE | | 1 | 5 | | |
|* 2 | TABLE ACCESS FULL| T2 | 1707 | 8535 | 246 (0)| 00:00:01 |
|* 3 | TABLE ACCESS FULL | T1 | 3 | 39 | 3 (0)| 00:00:01 |
---------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("T2"."OWNER"=:B1)
3 - filter("USERNAME" LIKE 'S%')
Optimizer cannot transform the given query
select t1.user_id,
( select count(*)
from t2
where t1.username = t2.owner) max_created
from t1
where username like 'S%';
Into like this
SELECT "T1"."USER_ID" "USER_ID",
count(*) "MAX_CREATED"
FROM "RAJESH"."T2" "T2",
"RAJESH"."T1" "T1"
WHERE "T1"."USERNAME" LIKE 'S%'
AND "T1"."USERNAME"="T2"."OWNER"(+)
AND "T2"."OWNER"(+) LIKE 'S%'
GROUP BY "T2"."OWNER",
"T1".ROWID,
"T1"."USER_ID"
a) Do an outer join between T1 and T2
b) Perform count(*) on top of the result set. Which is very different from just counting the rows in T2.
We cannot take an optimization performed for an aggregate function MAX() and apply the same to COUNT(*) (since both max() and count(*) belong to aggregates, they are different in functionality )
So in short the rule to perform scalar subquery unnesting for COUNT (*) queries should be like this.
a) Check if T2 has any not null column
1) if yes then kick-in scalar subquery unnesting (perform outer join between T1 and T2)
2) replace count(*) in the query with count(that_not_nullable_column)
b) if T2 dont have any not null column then
1) dont perform scalar subquery unnesting, (since doing that would produce incorrect results). So have it as such.
But as of 12.1.0.2 the optimizer don't have the above steps coded for COUNT(*) aggregates in scalar subquery un-nesting.
Is it good to set the value of _optimizer_unnest_scalar_sq to FALSE or leave the default value TRUE
ReplyDelete@Anonymous:
Deletedon't set any hidden parameters ( parameter start with _) without Oracle support guidance.