Monday, April 5, 2010

CURSOR_SHARING = SIMILAR / FORCE / EXACT

CURSOR_SHARING is a parameter Oracle Database uses to control whether it will "auto-bind" a SQL statement. Oracle Database can take a query of the form SELECT * FROM TABLE WHERE COL = 'literal' and replace the 'literal' with a bind value—so the predicate will become WHERE COL = :"SYS_B_0". This permits the reuse of the generated query plan, perhaps leading to better utilization of the shared pool and a reduction of hard parses performed by the system. The CURSOR_SHARING parameter can have one of three values:

  • EXACT: This is the default setting. With this value in place, the query is not rewritten to use bind variables.
  • FORCE: This setting rewrites the query, replacing all literals with bind values and setting up a one-size-fits-all plan—a single plan for the rewritten query. I'll demonstrate what that implies in a moment.
  • SIMILAR: This setting also rewrites the query, replacing the literals with bind variables, but can set up different plans for different bind variable combinations. This last point is why CURSOR_SHARING=SIMILAR might reduce the number of plans generated. Because multiple plans may be generated, the setting of SIMILAR may or may not reduce the number of actual plans you observe in the shared pool

Let's first look at what might happen if I execute the same set of queries with these three settings. The queries will simply be SELECT * FROM DUAL WHERE DUMMY = , and I'll use 'A' and 'B' for . Then I'll look in the shared pool via the V$SQL dynamic performance view and see how many cursors were set up for each query. Listing 1 sets the three CURSOR_SHARING values, runs SELECT queries, and looks at the content of the SQL_TEXT column in V$SQL to see the actual SQL used in the queries.


Code Listing 1: EXACT, FORCE, and SIMILAR for CURSOR_SHARING

alter system flush shared_pool;
alter session set cursor_sharing=exact;
select * from dual cs_exact where dummy = 'A';
select * from dual cs_exact where dummy = 'B';

alter session set cursor_sharing=force;
select * from dual cs_force where dummy = 'A';
select * from dual cs_force where dummy = 'B';

alter session set cursor_sharing=similar;
select * from dual cs_similar where dummy = 'A';
select * from dual cs_similar where dummy = 'B';

rajesh@
10GR2> select sql_text from v$sql
  2  where UPPER(sql_text) like 'SELECT%*%FROM%DUAL%CS%';

SQL_TEXT
------------------------------------------------------------------
select * from dual cs_similar where dummy = :"SYS_B_0"
select * from dual cs_force where dummy = :"SYS_B_0"
select * from dual cs_exact where dummy = 'A'
select * from dual cs_exact where dummy = 'B'


As you can see in Listing 1, with CURSOR_SHARING=EXACT (the default), every unique SQL statement I submit will create a new entry in V$SQL, it will be hard-parsed, and an execution plan will be created just for it. There can be hundreds or thousands of very similar queries in the shared pool that differ only in the literals used in the SQL statement itself. This implies that the application itself is not using bind variables, and that implies that the database is forced to hard-parse virtually every query, which, in turn, not only consumes a lot of CPU cycles but also leads to decreased scalability. The database just cannot hard-parse hundreds or thousands of SQL statements concurrently—the application ends up waiting for the shared pool to become available. One of the major scalability inhibitors in the database is not using bind variables. That was the motivation behind adding CURSOR_SHARING=FORCE in Oracle8i Release 2 (8.1.6)—to help alleviate this performance and scalability inhibitor.

With CURSOR_SHARING=FORCE in place in Listing 1, the database generated only one shareable query in the shared pool—it replaced 'A' and 'B' with :"SYS_B_0" and made the cursor shareable by as many sessions as would need it. In general, just one query plan would be reused by all sessions. This would turn the hard parse into a soft parse, which would consume fewer resources and simultaneously increase the scalability of the system, by allowing for more concurrent work, because a soft parse needs to "latch" (use a certain type of lock on) the shared pool less than a hard parse.

However, looking at the example in Listing 1 might lead you to assume that the settings of FORCE and SIMILAR are the same—the results certainly seem that way right now, because both resulted in a single plan. So what is the difference between these two settings? I'll need another example to show that, but I can describe it first. When CURSOR_SHARING is set to SIMILAR, Oracle Database will replace all literals with bind variables, just as FORCE would, but the SIMILAR value does one other thing—it looks at each literal it replaces and asks, "Could different values for this bind variable lead, in turn, to different plans?" For example, if the predicate WHERE X=6 implies that I would want to use a full scan but the predicate WHERE X=5 implies that I would want to use an index range scan, the database would recognize that and set up different plans for me. In the case of different plans, you mark the bind variable as unsafe and add its value to the signature of the query, so to reuse this cursor, you must not only have the same SQL statement but also the same value for that particular bind variable.

That is why SIMILAR might reduce the number of plans you see in the shared pool but, then again, might not. In order to let you observe this and really see what is happening, I'll set up a table with some very skewed data—so skewed that when I query WHERE ID=1, Oracle Database will want to use an index on ID, and when I query WHERE ID=99, Oracle Database will not want to use an index. Listing 2 creates the skewed data and index and returns the execution plans for the skewed data.

Code Listing 2: Creating table, index, and plans for skewed data.

create table t
as
select decode(rownum,1,1,99) as id,
        a.*
from all_objects a;

create index t_ind on t(id);

begin
    dbms_stats.gather_table_stats(ownname=>USER,
                                        tabname=>'T',
                                        estimate_percent=>dbms_stats.auto_sample_size,
                                        cascade=>true,
                                        method_opt=>'for all indexed columns size 254');
end;
/

rajesh@
10GR2> set autotrace traceonly explain;
rajesh@
10GR2> select * from t
  2  where id = 1;
-------------------------------------------------------------------------------------
| Id  | Operation                   | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |       |     1 |    97 |     2   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T     |     1 |    97 |     2   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | T_IND |     1 |       |     1   (0)| 00:00:01 |
-------------------------------------------------------------------------------------

rajesh@
10GR2> select * from t
  2  where id = 99;

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      | 56777 |  5378K|   185   (2)| 00:00:03 |
|*  1 |  TABLE ACCESS FULL| T    | 56777 |  5378K|   185   (2)| 00:00:03 |
--------------------------------------------------------------------------




Table T in Listing 2 contains a column ID, which is very much skewed—most of the values are 99, with one record containing a value of 1. After I index and gather statistics on the table (generating histograms on that indexed column, so the optimizer knows that the data is skewed), I can see that the optimizer prefers an index range scan over a full scan when ID=1 is used and vice versa for ID=99.

Now, let's query that table with different ID= predicates, using CURSOR_SHARING=FORCE and then SIMILAR, as shown in Listing 3 (I already know what to expect with CURSOR_SHARING=EXACT—a single plan for each unique ID value supplied). 

Code Listing 3: FORCE, SIMILAR, and skewed data.

alter system flush shared_pool;  
alter session set cursor_sharing=force;
select * from t cs_force where id = 1;
select * from t cs_force where id = 50;
select * from t cs_force where id = 99;
select * from t cs_force where id = 1;
select * from t cs_force where id = 50;
select * from t cs_force where id = 99;

alter session set cursor_sharing=similar;
select * from t cs_similar where id = 1;
select * from t cs_similar where id = 50;
select * from t cs_similar where id = 99;
select * from t cs_similar where id = 1;
select * from t cs_similar where id = 50;
select * from t cs_similar where id = 99;

rajesh@10GR2> select sql_text from v$sql
  2  where UPPER(sql_text) like 'SELECT%*%FROM%T%CS%';

SQL_TEXT
---------------------------------------------------------
select * from t cs_force where id = :"SYS_B_0"
select * from t cs_similar where id = :"SYS_B_0"
select * from t cs_similar where id = :"SYS_B_0"
select * from t cs_similar where id = :"SYS_B_0"

As you can see in Listing 3, when CURSOR_SHARING=FORCE was true, one—and only one—plan was generated. It was, in fact, the one-size-fits-all plan, and in this case, the plan would use the index range scan (because the first query I parsed used ID=1 and the optimizer would use that bind variable value to generate the plan).

However, when CURSOR_SHARING=SIMILAR was true in Listing 3, three plans were generated, because the optimizer detected that a different value used when searching against the ID column could lead to a different plan (the statistics generated in Listing 2 gave it that information). Hence the actual bind variable value was added to the signature of that query plan and only a query with exactly the same signature could reuse the plan. That was the purpose of running each of the queries twice: to show that cursor reuse is possible. There were not six queries in V$SQL, just four. With CURSOR_SHARING=SIMILAR cursor reuse is not guaranteed, by design.

So, does that mean that for any unique set of literals, CURSOR_SHARING=SIMILAR will generate a new plan? No, I already saw that demonstrated with the DUAL table in Listing 1 when using WHERE DUMMY='A' and WHERE DUMMY='B'. It is only when the bind variable substitution is deemed unsafe that CURSOR_SHARING=SIMILAR will generate a new plan. Using the example in Listing 2, the only unsafe binding is against the ID column—if I query against that column and some other column but keep the ID column constant, I'll see cursor reuse, as shown in Listing 4.

Code Listing 4: CURSOR_SHARING=SIMILAR  
 

alter system flush shared_pool;
alter session set cursor_sharing=similar;
select * from t cs_similar where id = 1 and object_id =1;
select * from t cs_similar where id = 1 and object_id =2;
select * from t cs_similar where id = 1 and object_id =3;

rajesh@
10GR2> select sql_text
  2  from v$sql
  3  where upper(sql_text) like 'SELECT%*%FROM%T%CS%';

SQL_TEXT
----------------------------------------------------------------------------
select * from t cs_similar where id = :"SYS_B_0" and object_id =:"SYS_B_1" 


I varied the literals used to search against the OBJECT_ID column—but not the ID column—in Listing 4. The optimizer recognized that OBJECT_ID is safe; it would not generate different plans based on different inputs for that column, so it did not add OBJECT_ID to the signature of the cursor. Only when a different value is used against ID will a new plan be generated. So, that shows that CURSOR_SHARING=SIMILAR might reduce the number of entries you see in the shared pool. If this application were to vary the literals used against the ID column and use hundreds or thousands of unique values, CURSOR_SHARING=SIMILAR would not have a significant impact on the shared pool utilization.
 

11 comments:

  1. Good explanation.Was very useful.

    ReplyDelete
  2. Wonderfull explanation. Thanks for sharing knowledge...

    Thanks
    jamsher

    ReplyDelete
  3. Great Post. Very thorough..

    ReplyDelete
  4. Nice explanation,Thanks for sharing this.

    ReplyDelete
  5. Very Good explanation. Keep up the work and do more like this.

    ReplyDelete
  6. In the last phrase, you said that: "If this application were to vary the literals used against the ID column and use hundreds or thousands of unique values, CURSOR_SHARING=SIMILAR would not have a significant impact on the shared pool utilization."

    would NOT have?... I think that it would HAVE a significant impact on the shared pool, as any different literal (from those hundreds or thousands) should generate another different execution plan, and the optimizer should add hundreds or thousands of signatures to the query in the shared pool.
    Am I wrong?

    ReplyDelete
  7. It is truly a well-researched content and excellent wording. I got so engaged in this material that I couldn’t wait reading. I am impressed with your work and skill. Thanks. dark0de market url

    ReplyDelete