One of the most common myth about index is, if any SQL
statement is about to retrieve one percent of data from a table, then the index
based access would be the best option.
However that is not the case **always**, it depends on
clustering of the data.
If the data is very well clustered according to the index,
then the cost of reading the table via the index would be the cheap.
Created a table with unique index on ID column and a very
good appealing clustering factor.
demo@ORA12C>
create table t (id number NOT NULL, some_other_datas varchar2(300) );
Table created.
demo@ORA12C>
create unique index t_idx on t(id);
Index created.
demo@ORA12C>
insert into t (id,some_other_datas)
2
select rownum, rpad('*',200,'*')
3 from
all_objects, all_users
4 where
rownum <=2e6;
2000000 rows
created.
demo@ORA12C>
exec dbms_stats.gather_table_stats(user,'T');
PL/SQL
procedure successfully completed.
demo@ORA12C>
select i.index_name, i.clustering_factor, t.num_rows, t.blocks
2 from
user_indexes i ,
3 user_tables
t
4
where i.table_name = t.table_name
5 and
i.index_name ='T_IDX';
INDEX_NAME
CLUSTERING_FACTOR NUM_ROWS BLOCKS
----------
----------------- ---------- ----------
T_IDX 58824 2000000
59057
demo@ORA12C>
To retrieve one percent of data from this table, the optimizer picks
the index as best access path.
demo@ORA12C>
alter session set statistics_level=all;
Session
altered.
demo@ORA12C>
set serveroutput off
demo@ORA12C>
set feedback only
demo@ORA12C>
select *
2 from
t
3
where id between 20000 and 40000;
20001 rows
selected.
demo@ORA12C>
set feedback 6
demo@ORA12C>
select * from table( dbms_xplan.display_cursor(format=>'allstats last'));
PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------
SQL_ID b7jpkag179cf6, child number 0
-------------------------------------
select * from t
where id between 20000 and 40000
Plan hash
value: 767293772
-------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time
| Buffers |
-------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | |
1 | | 20001 |00:00:00.01 | 889 |
| 1 |
TABLE ACCESS BY INDEX ROWID BATCHED| T |
1 | 20002 | 20001 |00:00:00.01 | 889 |
|* 2 |
INDEX RANGE SCAN | T_IDX | 1 | 20002 |
20001 |00:00:00.01 | 174 |
-------------------------------------------------------------------------------------------------------
Predicate
Information (identified by operation id):
---------------------------------------------------
2 - access("ID">=20000 AND
"ID"<=40000)
19 rows
selected.
demo@ORA12C>
The query completed in less than a second returning about 20K rows doing
890 Logical IOs.
Now let’s truncate and reload this table with a different set of data,
having a poor clustering factor.
demo@ORA12C>
truncate table t;
Table
truncated.
demo@ORA12C>
insert into t (id,some_other_datas)
2
select dbms_random.value(1,2000000), rpad('*',200,'*')
3 from all_objects, all_users
4
where rownum <=2e6;
2000000 rows
created.
demo@ORA12C>
exec dbms_stats.gather_table_stats(user,'T');
PL/SQL
procedure successfully completed.
demo@ORA12C>
select i.index_name, i.clustering_factor, t.num_rows, t.blocks
2 from
user_indexes i ,
3
user_tables t
4
where i.table_name = t.table_name
5 and
i.index_name ='T_IDX';
INDEX_NAME
CLUSTERING_FACTOR NUM_ROWS BLOCKS
----------
----------------- ---------- ----------
T_IDX 2005915 2000000
65177
Now to retrieve one percent of data from this table, the optimizer
picks the Full table scan as best access path.
demo@ORA12C>
alter session set statistics_level=all;
Session
altered.
demo@ORA12C>
set serveroutput off
demo@ORA12C>
set feedback only
demo@ORA12C>
select *
2 from
t
3
where id between 20000 and 40000;
20119 rows
selected.
demo@ORA12C>
set feedback 6
demo@ORA12C>
select * from table( dbms_xplan.display_cursor(format=>'allstats last'));
PLAN_TABLE_OUTPUT
-----------------------------------------------------------------------------------------------------------------------
SQL_ID b7jpkag179cf6, child number 0
-------------------------------------
select * from t
where id between 20000 and 40000
Plan hash
value: 1601196873
---------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows
| A-Time | Buffers | Reads |
---------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT |
| 1 | |
20119 |00:00:06.42 | 129K|
64669 |
|* 1 |
TABLE ACCESS FULL| T |
1 | 20002 | 20119 |00:00:06.42 | 129K|
64669 |
---------------------------------------------------------------------------------------------
Predicate
Information (identified by operation id):
---------------------------------------------------
1 - filter(("ID"<=40000 AND
"ID">=20000))
18 rows
selected.
This look around 6 sec to complete and returned 20K rows by doing 129K
logical IOs.
Though the optimizer doesn’t pick the index to answer this query, if we
force the index via a Hint, things become worse like this.
demo@ORA12C>
set feedback only
demo@ORA12C>
select /*+ index(t,t_idx) */ *
2 from
t
3
where id between 20000 and 40000;
20119 rows
selected.
demo@ORA12C>
set feedback 6
demo@ORA12C>
select * from table( dbms_xplan.display_cursor(format=>'allstats last'));
PLAN_TABLE_OUTPUT
-----------------------------------------------------------------------------------------------------------------------
SQL_ID 478424k8ajjqh, child number 0
-------------------------------------
select /*+
index(t,t_idx) */ * from t where id between 20000 and 40000
Plan hash
value: 767293772
----------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time
| Buffers | Reads |
----------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT |
| 1 | |
20119 |00:02:42.33 | 37768 |
18066 |
| 1 |
TABLE ACCESS BY INDEX ROWID BATCHED| T |
1 | 20002 | 20119 |00:02:42.33 | 37768 |
18066 |
|* 2 |
INDEX RANGE SCAN | T_IDX | 1 |
20002 | 20119 |00:00:02.01
| 293 | 10 |
----------------------------------------------------------------------------------------------------------------
Predicate
Information (identified by operation id):
---------------------------------------------------
2 - access("ID">=20000 AND
"ID"<=40000)
19 rows
selected.
demo@ORA12C>
Using index the sql took around 160 sec, where as a Full table scan did
only 6 sec.
So the answer to this question – if a SQL is about to return one percent
of data from a table, then the choice to use either the full table scan or the
index based access purely depends on clustering of the data.
Thank you for the posts related to performance Tunings:
ReplyDeleteCOuld you please help us to undertand How to create profiles.
How to create baselines
and How troubleshoot Buffer Busy waits - During INSERTS
-A big Follower of your blog. Thank you so much for your efforts.
Waiting for the new posts soon.
Regards,
Smita