Wednesday, February 21, 2018

Virtual index (fake index)

At my work got few questions about Virtual index from fellow DBA, how about creating virtual index on a live production system and having them tested against the production workloads.
Before getting into those response in detail, thought it worth a blog post to help explain about virtual index.
It is possible to create Virtual index (aka fake index) manually using NOSEGMENT clause, kind of “pretend” to see what the CBO might do if such an index really exists.
Creating a virtual index doesn’t consume any storage and doesn’t maintained any way during the DML operations on the table, and it can’t be altered or rebuild like the real index. You can do stats gather (using dbms_stats) or analyze validate structure but no details will be available in user_indexes and index_stats data dictionary.
It is only visible to the CBO if and only if the session has set this hidden parameter (_use_nosegment_indexes = true)
The CBO will now consider to cost the index and potentially include them in the explain plan, however at runtime Oracle can of course not use that index and revert to the next best thing.
Let’s start with a demo. With a table in which the column OBJECT_ID values are distributed throughout the whole table.
demo@ORA12C> create table t( id int, object_id int, created date );
 
Table created.
 
demo@ORA12C> insert into t (id, object_id, created)
  2  select id, mod(id,100)+1 , created
  3  from big_table;
 
1000000 rows created.
 
demo@ORA12C> commit;
 
Commit complete.
 
demo@ORA12C> exec dbms_stats.gather_table_stats(user,'T');
 
PL/SQL procedure successfully completed.
 
demo@ORA12C>
 
We can now create a virtual index based on the OBJECT_ID column.
demo@ORA12C> create index t_idx on t(object_id) NOSEGMENT;
 
Index created.
 
demo@ORA12C>
 
Creating a virtual index makes no entry in USER_INDEXES and USER_SEGMENTS dictionary, but mark an entry into USER_OBJECTS.
 
demo@ORA12C> select * from user_segments where segment_name ='T_IDX';
 
no rows selected
 
demo@ORA12C> select * from user_indexes where index_name ='T_IDX';
 
no rows selected
 
demo@ORA12C> select object_name,status from user_objects where object_name ='T_IDX';
 
OBJECT_NAME                    STATUS
------------------------------ -------
T_IDX                          VALID
 
demo@ORA12C>
 
 
Now to verify if the optimizer make use of the virtual index, let’s do an explain plan.
 
 
demo@ORA12C> alter session set "_use_nosegment_indexes"=true;
 
Session altered.
 
demo@ORA12C> set autotrace traceonly explain
demo@ORA12C> select * from t where object_id = 42;
 
Execution Plan
----------------------------------------------------------
Plan hash value: 767293772
 
---------------------------------------------------------------------------------------------
| Id  | Operation                           | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |       | 10000 |   156K|     9   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID BATCHED| T     | 10000 |   156K|     9   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN                  | T_IDX | 10000 |       |     1   (0)| 00:00:01 |
---------------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   2 - access("OBJECT_ID"=42)
 
demo@ORA12C> set autotrace off
demo@ORA12C> alter session set "_use_nosegment_indexes"=false;
 
Session altered.
 
demo@ORA12C>
 
Indeed the virtual index is costed and used in the explain plan. If we now physically create this index for real then.
 
demo@ORA12C> drop index t_idx;
 
Index dropped.
 
demo@ORA12C> create index t_idx on t(object_id);
 
Index created.
 
demo@ORA12C> column index_name format a10
demo@ORA12C> select i.index_name,i.clustering_factor,t.blocks,t.num_rows
  2  from user_indexes i,
  3       user_tables t
  4  where t.table_name = i.table_name
  5  and i.index_name ='T_IDX';
 
INDEX_NAME CLUSTERING_FACTOR     BLOCKS   NUM_ROWS
---------- ----------------- ---------- ----------
T_IDX                 288917       2890    1000000
 
demo@ORA12C>
demo@ORA12C> set autotrace traceonly explain
demo@ORA12C> select * from t where object_id = 42;
 
Execution Plan
----------------------------------------------------------
Plan hash value: 1601196873
 
--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      | 10000 |   156K|   796   (2)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| T    | 10000 |   156K|   796   (2)| 00:00:01 |
--------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   1 - filter("OBJECT_ID"=42)
 
demo@ORA12C> set autotrace off
 
we see the CBO has decided to perform the now cheaper full table scan, Although the Virtual Index on this column was used, once the actual characteristics of the index are determined via the index statistics, the CBO has decided the actual physical index was just too expensive to use to retrieve the 1% of rows.
 
Virtual index is one among many heuristics built into the CBO. Virtual index can be useful to quickly determine whether an index is a viable option if it were to be actually created, however caution need to be exercised.
 
The big advantage of a Virtual Index of course is that it doesn’t really exist and so consumes no storage and can be created extremely quickly/cheaply.
 
The disadvantage of a Virtual index is that it doesn’t really exist and so Oracle cannot collect segment level statistics. Without statistics however, the CBO has a very tough time of doing its job properly …
So don’t use virtual index in live production system, since it is an undocumented feature and also requires an undocumented parameter set at the session level.
But it is safe to play with virtual index in Pre-prod/SIT/UAT/Test database and see if the real index benefit the workload. 
 
Always have the real index in the live production system but not the virtual index.

Wednesday, February 14, 2018

online stats gathering for partitioned Tables

One of the nice feature introduced in Oracle 12c (12.1.0.1) is the ability to gather online stats for tables during bulk loads such as CREATE TABLE AS SELECT operation or Direct path insert into an empty table.
 
demo@ORA12C> create table t
  2  partition by range( created )
  3     interval( numtoyminterval(1,'year') )
  4  ( partition p_2013 values less than
  5     ( to_date('01-Jan-2014','dd-mon-yyyy') ) )
  6  as
  7  select owner,object_name,object_id,created
  8  from all_objects;
 
Table created.
 
demo@ORA12C> select num_rows,blocks,last_analyzed
  2  from user_Tables
  3  where table_name ='T';
 
  NUM_ROWS     BLOCKS LAST_ANALYZED
---------- ---------- -----------------------
     61521        550 12-FEB-2018 12:53:31 pm
 
demo@ORA12C> select column_name,num_distinct,last_analyzed
  2  from user_tab_col_statistics
  3  where table_name ='T';
 
COLUMN_NAME          NUM_DISTINCT LAST_ANALYZED
-------------------- ------------ -----------------------
OWNER                          20 12-FEB-2018 12:53:31 pm
OBJECT_NAME                 51692 12-FEB-2018 12:53:31 pm
OBJECT_ID                   61521 12-FEB-2018 12:53:31 pm
CREATED                      1294 12-FEB-2018 12:53:31 pm
 
demo@ORA12C>
 
So as part of data load, the database is able to gather the stats on this object, this improves the performance and manageability of bulk load operations by eliminating the user intervention to gather statistics after the load by removing an additional full table scan required for separate statistics gathering operations.
 
Since the online stats gathering was designed to work with minimal impacts to direct path operation, database by default don’t gather partition level statistics.
 
 
demo@ORA12C> select partition_name,num_rows,blocks,last_analyzed
  2  from user_tab_partitions
  3  where table_name ='T';
 
PARTITION_   NUM_ROWS     BLOCKS LAST_ANALYZED
---------- ---------- ---------- -----------------------
P_2013
SYS_P10261
SYS_P10281
 
demo@ORA12C> select partition_name,column_name,num_distinct,last_analyzed
  2  from user_part_col_statistics
  3  where table_name ='T';
 
PARTITION_ COLUMN_NAME          NUM_DISTINCT LAST_ANALYZED
---------- -------------------- ------------ -----------------------
P_2013     OWNER
P_2013     OBJECT_NAME
P_2013     OBJECT_ID
P_2013     CREATED
SYS_P10261 OWNER
SYS_P10261 OBJECT_NAME
SYS_P10261 OBJECT_ID
SYS_P10261 CREATED
SYS_P10281 OWNER
SYS_P10281 OBJECT_NAME
SYS_P10281 OBJECT_ID
SYS_P10281 CREATED
 
12 rows selected.
 
demo@ORA12C>
 
So just to build the partition level statistics in this case, we can make use of the option GATHER AUTO without updating the global statistics. Using GATHER AUTO means we are asking oracle to gather only the missing statistics in this table (which in this case is a partitioned level statistics).
 
demo@ORA12C> exec dbms_stats.gather_table_stats(user,'T',options=>'GATHER AUTO');
 
PL/SQL procedure successfully completed.
 
demo@ORA12C> select num_rows,blocks,last_analyzed
  2  from user_Tables
  3  where table_name ='T';
 
  NUM_ROWS     BLOCKS LAST_ANALYZED
---------- ---------- -----------------------
     61521        550 12-FEB-2018 12:53:31 pm
 
demo@ORA12C> select column_name,num_distinct,last_analyzed
  2  from user_tab_col_statistics
  3  where table_name ='T';
 
COLUMN_NAME          NUM_DISTINCT LAST_ANALYZED
-------------------- ------------ -----------------------
OWNER                          20 12-FEB-2018 12:53:31 pm
OBJECT_NAME                 51692 12-FEB-2018 12:53:31 pm
OBJECT_ID                   61521 12-FEB-2018 12:53:31 pm
CREATED                      1294 12-FEB-2018 12:53:31 pm
 
demo@ORA12C> select partition_name,num_rows,blocks,last_analyzed
  2  from user_tab_partitions
  3  where table_name ='T';
 
PARTITION_   NUM_ROWS     BLOCKS LAST_ANALYZED
---------- ---------- ---------- -----------------------
P_2013              0          0 12-FEB-2018 01:20:37 pm
SYS_P10261      61503        531 12-FEB-2018 01:20:38 pm
SYS_P10281         18         19 12-FEB-2018 01:20:38 pm
 
demo@ORA12C> select partition_name,column_name,num_distinct,last_analyzed
  2  from user_part_col_statistics
  3  where table_name ='T';
 
PARTITION_ COLUMN_NAME          NUM_DISTINCT LAST_ANALYZED
---------- -------------------- ------------ -----------------------
P_2013     OWNER                           0 12-FEB-2018 01:20:37 pm
P_2013     OBJECT_NAME                     0 12-FEB-2018 01:20:37 pm
P_2013     OBJECT_ID                       0 12-FEB-2018 01:20:37 pm
P_2013     CREATED                         0 12-FEB-2018 01:20:37 pm
SYS_P10261 OWNER                          20 12-FEB-2018 01:20:37 pm
SYS_P10261 OBJECT_NAME                 51676 12-FEB-2018 01:20:37 pm
SYS_P10261 OBJECT_ID                   61503 12-FEB-2018 01:20:37 pm
SYS_P10261 CREATED                      1278 12-FEB-2018 01:20:37 pm
SYS_P10281 OWNER                           1 12-FEB-2018 01:20:38 pm
SYS_P10281 OBJECT_NAME                    17 12-FEB-2018 01:20:38 pm
SYS_P10281 OBJECT_ID                      18 12-FEB-2018 01:20:38 pm
SYS_P10281 CREATED                        16 12-FEB-2018 01:20:38 pm
 
12 rows selected.
 
demo@ORA12C>
 

Wednesday, February 7, 2018

on Count

Hearing a lot about count(1) Vs count(*) misconception – so thought of writing this blog post to break all those misleading ideas.
 
Technically count(1) and count(*) are the same, nothing different.
 
Count(*) counts records in a table. (it does not  have to get the full record, it just need to know there is a record and increments a count)
 
Count(1) counts not null occurrence of the constant 1 in a table (it does not have to get the full record, it just need to know there is a record and increments a count)
 
They are same, identical, not different.
 
Tkprof shows the same.
 
select count(*)
from
 big_table
 
 
call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.04          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch        2      0.28       2.63      15204      15213          0           1
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        4      0.28       2.68      15204      15213          0           1
 
Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 91 
Number of plan statistics captured: 1
 
Rows (1st) Rows (avg) Rows (max)  Row Source Operation
---------- ---------- ----------  ---------------------------------------------------
         1          1          1  SORT AGGREGATE (cr=15213 pr=15204 pw=0 time=0 us starts=2635299)
   1000000    1000000    1000000   TABLE ACCESS FULL BIG_TABLE (cr=15213 pr=15204 pw=0 time=4220 us starts=878850 cost=4220 size=0 card=1000000)
 
 
Elapsed times include waiting on following events:
  Event waited on                             Times   Max. Wait  Total Waited
  ----------------------------------------   Waited  ----------  ------------
  SQL*Net message to client                       2        0.00          0.00
  Disk file operations I/O                        1        0.07          0.07
  direct path read                              182        0.10          2.32
  SQL*Net message from client                     2        0.00          0.00
********************************************************************************
 
SQL ID: bua52c07ys6bc Plan Hash: 599409829
 
select count(1)
from
 big_table
 
 
call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch        2      0.20       2.55      15204      15209          0           1
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        4      0.20       2.56      15204      15209          0           1
 
Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 91 
Number of plan statistics captured: 1
 
Rows (1st) Rows (avg) Rows (max)  Row Source Operation
---------- ---------- ----------  ---------------------------------------------------
         1          1          1  SORT AGGREGATE (cr=15209 pr=15204 pw=0 time=0 us starts=2559338)
   1000000    1000000    1000000   TABLE ACCESS FULL BIG_TABLE (cr=15209 pr=15204 pw=0 time=4220 us starts=812760 cost=4220 size=0 card=1000000)
 
 
Elapsed times include waiting on following events:
  Event waited on                             Times   Max. Wait  Total Waited
  ----------------------------------------   Waited  ----------  ------------
  SQL*Net message to client                       2        0.00          0.00
  direct path read                              164        0.18          2.31
  SQL*Net message from client                     2        0.00          0.00
 
 
 
********************************************************************************
 
 
Also 10053 trace file shows that count(1) is silently translated into count(*)
 
rajesh@ORA11G> @tkfilename
D:\APP\VNAMEIT\diag\rdbms\ora11g\ora11g\trace\ora11g_ora_10680.trc
rajesh@ORA11G> @10053
 
Session altered.
 
rajesh@ORA11G>
rajesh@ORA11G> explain plan for
  2  select count(1)
  3  from demo.big_table;
 
Explained.
 
rajesh@ORA11G> exit
Disconnected from Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
 
C:\Users\179818>
 
 
Looking into 10053 trace show this.
 
*******************************************
Peeked values of the binds in SQL statement
*******************************************
 
Final query after transformations:******* UNPARSED QUERY IS *******
SELECT COUNT(*) "COUNT(1)" FROM "DEMO"."BIG_TABLE" "BIG_TABLE"
kkoqbc: optimizing query block SEL$1 (#0)
 
This confirms that count (1) silently turned into count (*).
 
Count(any_thing) is meaningless.
 
So the right way to count the number of rows in an object (table/materialized view/(sub)partition) is count(*), period .