One of many potential capabilities of
Oracle database 21c was the ability of auto indexing (AI) of JSON expressions.
Will start by creating a simple table that uses the new JSON datatype
(introduced in Oracle 21c – the following example was run from Oracle
Autonomous database cloud service having Oracle 21c(21.3) ) with some JSON
documents loaded into it.
 
ai_demo@ATP21C> create
table demo032
  2 
nologging as
  3 
select rownum as id,
  4    a.created as created_dt,
  5    json_object(
  6    'OWNER' value a.OWNER ,
  7    'OBJECT_NAME' value a.OBJECT_NAME ,
  8    'SUBOBJECT_NAME' value a.SUBOBJECT_NAME ,
  9    'OBJECT_ID' value a.OBJECT_ID ,
 10    'OBJECT_NEW_ID' value rownum ,
 11    'DATA_OBJECT_ID' value a.DATA_OBJECT_ID ,
 12    'OBJECT_TYPE' value a.OBJECT_TYPE ,
 13    'CREATED' value a.CREATED ,
 14    'LAST_DDL_TIME' value a.LAST_DDL_TIME ,
 15    'TIMESTAMP' value a.TIMESTAMP ,
 16    'STATUS' value a.STATUS returning JSON ) my_json_data
 17  from
stage a, stage b
 18 
where rownum <= 10000000 ;
 
Table created.
 
Then I run a few SQL statements with
different JSON expression-based predicates, like this
 
ai_demo@ATP21C> set
autotrace traceonly exp statistics
ai_demo@ATP21C> select *
from demo032 where json_value(my_json_data,'$.OBJECT_NEW_ID') = '55';
 
 
Execution Plan
----------------------------------------------------------
Plan hash value: 3377958815
 
-------------------------------------------------------------------------------------
| Id  | Operation                 | Name    | Rows 
| Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT          |         |  
100K|    31M|   130K 
(1)| 00:00:06 |
|*  1 | 
TABLE ACCESS STORAGE FULL| DEMO032 |  
100K|    31M|   130K 
(1)| 00:00:06 |
-------------------------------------------------------------------------------------
 
Predicate Information
(identified by operation id):
---------------------------------------------------
 
   1 -
storage(JSON_VALUE("MY_JSON_DATA" /*+ LOB_BY_VALUE */  FORMAT OSON ,
              '$.OBJECT_NEW_ID' RETURNING
VARCHAR2(4000) NULL ON ERROR)='55')
      
filter(JSON_VALUE("MY_JSON_DATA" /*+ LOB_BY_VALUE */  FORMAT OSON ,
              '$.OBJECT_NEW_ID' RETURNING
VARCHAR2(4000) NULL ON ERROR)='55')
 
Note
-----
   - automatic DOP: Computed Degree of
Parallelism is 1
 
 
Statistics
----------------------------------------------------------
          0 
recursive calls
          0 
db block gets
     478790 
consistent gets
     478784 
physical reads
          0 
redo size
        960 
bytes sent via SQL*Net to client
        433 
bytes received via SQL*Net from client
          3 
SQL*Net roundtrips to/from client
          0 
sorts (memory)
          0 
sorts (disk)
          1 
rows processed
 
ai_demo@ATP21C> select *
from demo032 where json_value(my_json_data,'$.OBJECT_NEW_ID' returning number)
= 55;
 
 
Execution Plan
----------------------------------------------------------
Plan hash value: 3377958815
 
-------------------------------------------------------------------------------------
| Id  | Operation                 | Name    | Rows 
| Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT          |         |  
100K|    31M|   130K 
(1)| 00:00:06 |
|*  1 | 
TABLE ACCESS STORAGE FULL| DEMO032 |  
100K|    31M|   130K 
(1)| 00:00:06 |
-------------------------------------------------------------------------------------
 
Predicate Information
(identified by operation id):
---------------------------------------------------
 
   1 -
storage(JSON_VALUE("MY_JSON_DATA" /*+ LOB_BY_VALUE */  FORMAT OSON ,
              '$.OBJECT_NEW_ID' RETURNING
NUMBER NULL ON ERROR)=55)
      
filter(JSON_VALUE("MY_JSON_DATA" /*+ LOB_BY_VALUE */  FORMAT OSON ,
              '$.OBJECT_NEW_ID' RETURNING
NUMBER NULL ON ERROR)=55)
 
Note
-----
   - automatic DOP: Computed Degree of
Parallelism is 1
 
 
Statistics
----------------------------------------------------------
          0 
recursive calls
          0 
db block gets
     478790 
consistent gets
     478784 
physical reads
          0 
redo size
        960 
bytes sent via SQL*Net to client
        433 
bytes received via SQL*Net from client
          3 
SQL*Net roundtrips to/from client
          0 
sorts (memory)
          0 
sorts (disk)
          1 
rows processed
 
ai_demo@ATP21C> select *
from demo032 t1 where t1.my_json_data.OBJECT_NEW_ID.number() = 55;
 
 
Execution Plan
----------------------------------------------------------
Plan hash value: 3377958815
 
-------------------------------------------------------------------------------------
| Id  | Operation                 | Name    | Rows 
| Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT          |         |  
100K|    31M|   130K 
(1)| 00:00:06 |
|*  1 | 
TABLE ACCESS STORAGE FULL| DEMO032 |  
100K|    31M|   130K 
(1)| 00:00:06 |
-------------------------------------------------------------------------------------
 
Predicate Information
(identified by operation id):
---------------------------------------------------
 
   1 -
storage(JSON_VALUE("T1"."MY_JSON_DATA" /*+ LOB_BY_VALUE
*/  FORMAT
              OSON , '$.OBJECT_NEW_ID.number()'
RETURNING NUMBER NULL ON ERROR)=55)
      
filter(JSON_VALUE("T1"."MY_JSON_DATA" /*+
LOB_BY_VALUE */  FORMAT
              OSON , '$.OBJECT_NEW_ID.number()'
RETURNING NUMBER NULL ON ERROR)=55)
 
Note
-----
   - automatic DOP: Computed Degree of
Parallelism is 1
 
 
Statistics
----------------------------------------------------------
          0 
recursive calls
          0 
db block gets
     478790 
consistent gets
     478784 
physical reads
          0 
redo size
        960 
bytes sent via SQL*Net to client
        433 
bytes received via SQL*Net from client
          3 
SQL*Net roundtrips to/from client
          0 
sorts (memory)
          0 
sorts (disk)
          1 
rows processed
 
ai_demo@ATP21C> set
autotrace off
 
they all return just one row, but most
currently use full table scan with no indexes present.
  
So, what does AI make of things?
  
The first thing to note is that running
the AI report generates the following error.
 
ai_demo@ATP21C> set echo
on linesize 1000 trimspool on trim on long 2000000 longchunksize 2000000
ai_demo@ATP21C> col report
for a180
ai_demo@ATP21C> select
dbms_auto_index.report_activity(systimestamp -1/24,
systimestamp,'TEXT','ALL','ALL') report from dual;
ERROR:
ORA-30954: char 0 is invalid
in json_value(MY_JSON_DATA, '$.OBJECT_NEW_ID' returni
ORA-06512: at
"SYS.DBMS_AUTO_INDEX", line 107
ORA-06512: at
"SYS.DBMS_AUTO_INDEX_INTERNAL", line 8686
ORA-06512: at
"SYS.DBMS_AUTO_INDEX_INTERNAL", line 8686
ORA-06512: at
"SYS.DBMS_AUTO_INDEX_INTERNAL", line 9236
ORA-06512: at
"SYS.DBMS_AUTO_INDEX", line 89
ORA-06512: at line 1
 
 
 
no rows selected
 
If we look at the indexes now present in
the table
  
 
ai_demo@ATP21C> select
table_name,index_name,index_type,indexing,auto,
  2         
status,visibility
   3  from
all_indexes
  4 
where table_name = 'DEMO032'
  5  and
owner ='AI_DEMO';
 
TABLE_NAME INDEX_NAME                INDEX_TYPE                  INDEXIN AUT STATUS   VISIBILIT
----------
------------------------- --------------------------- ------- --- --------
---------
DEMO032    SYS_AI_fwkyzzbd70x9s      FUNCTION-BASED NORMAL       FULL   
YES VALID    VISIBLE 
 DEMO032    SYS_AI_17jynda4v9sxu      FUNCTION-BASED NORMAL       FULL   
YES VALID    VISIBLE 
  
ai_demo@ATP21C> select
table_name,index_name,column_name,column_position
  2  from
user_ind_columns
  3  where
table_name = 'DEMO032'
  4 
order by 1,2;
 
TABLE_NAME INDEX_NAME                COLUMN_NAME     COLUMN_POSITION
----------
------------------------- --------------- ---------------
DEMO032    SYS_AI_17jynda4v9sxu      SYS_NC00005$                  1
DEMO032    SYS_AI_fwkyzzbd70x9s      SYS_NC00004$                  1
 
ai_demo@ATP21C> col
column_expression for a60
ai_demo@ATP21C> select
index_name,column_expression
  2  from
user_ind_expressions
  3 
where table_name = 'DEMO032';
 
INDEX_NAME                COLUMN_EXPRESSION
-------------------------
------------------------------------------------------------
SYS_AI_17jynda4v9sxu      JSON_VALUE("MY_JSON_DATA"
FORMAT OSON , '$.OBJECT_NEW_ID' RE
                          TURNING NUMBER ERROR
ON ERROR NULL ON EMPTY)
 
SYS_AI_fwkyzzbd70x9s      JSON_VALUE("MY_JSON_DATA"
FORMAT OSON , '$.OBJECT_NEW_ID' RE
                          TURNING
VARCHAR2(4000) ERROR ON ERROR NULL ON EMPTY)
 
 
We can see that AI has indeed created two
new automatic indexes, one on the varchar2 expression and other on numeric
expression. If we re-run the sql’s we notice few important points, note the
following was run after the automatic indexes were created.
 
ai_demo@ATP21C> set
autotrace traceonly exp
ai_demo@ATP21C> select *
from demo032 where json_value(my_json_data,'$.OBJECT_NEW_ID') = '55';
 
Execution Plan
----------------------------------------------------------
Plan hash value: 2321837570
 
------------------------------------------------------------------------------------------------------------
| Id  | Operation                           | Name                 | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |                      |   100K|   
31M|  9154   (1)| 00:00:01 |
|   1 | 
TABLE ACCESS BY INDEX ROWID BATCHED| DEMO032              |   100K|   
31M|  9154   (1)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN                  | SYS_AI_fwkyzzbd70x9s |
40000 |       |     3  
(0)| 00:00:01 |
------------------------------------------------------------------------------------------------------------
 
Predicate Information
(identified by operation id):
---------------------------------------------------
 
   2 -
access(JSON_VALUE("MY_JSON_DATA" /*+ LOB_BY_VALUE */  FORMAT OSON , '$.OBJECT_NEW_ID'
              RETURNING VARCHAR2(4000) ERROR ON
ERROR NULL ON EMPTY)='55')
 
Note
-----
   - automatic DOP: Computed Degree of
Parallelism is 1
 
ai_demo@ATP21C> select *
from demo032 where json_value(my_json_data,'$.OBJECT_NEW_ID' returning number)
= 55;
 
Execution Plan
----------------------------------------------------------
Plan hash value: 2305263965
 
------------------------------------------------------------------------------------------------------------
| Id  | Operation                           | Name                 | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |                      |   100K|   
31M|  1920   (1)| 00:00:01 |
|   1 | 
TABLE ACCESS BY INDEX ROWID BATCHED| DEMO032              |   100K|   
31M|  1920   (1)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN                  | SYS_AI_17jynda4v9sxu |
40000 |       |     3  
(0)| 00:00:01 |
------------------------------------------------------------------------------------------------------------
 
Predicate Information
(identified by operation id):
---------------------------------------------------
 
   2 -
access(JSON_VALUE("MY_JSON_DATA" /*+ LOB_BY_VALUE */  FORMAT OSON , '$.OBJECT_NEW_ID'
              RETURNING NUMBER ERROR ON ERROR
NULL ON EMPTY)=55)
 
Note
-----
   - automatic DOP: Computed Degree of
Parallelism is 1
 
ai_demo@ATP21C> select *
from demo032 t1 where t1.my_json_data.OBJECT_NEW_ID.number() = 55;
 
Execution Plan
----------------------------------------------------------
Plan hash value: 2305263965
 
------------------------------------------------------------------------------------------------------------
| Id  | Operation                           | Name                 | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |                      |   100K|   
31M|  1920   (1)| 00:00:01 |
|   1 | 
TABLE ACCESS BY INDEX ROWID BATCHED| DEMO032              |   100K|   
31M|  1920   (1)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN                  | SYS_AI_17jynda4v9sxu |
40000 |       |     3  
(0)| 00:00:01 |
------------------------------------------------------------------------------------------------------------
 
Predicate Information
(identified by operation id):
---------------------------------------------------
 
   2 -
access(JSON_VALUE("MY_JSON_DATA" /*+ LOB_BY_VALUE */  FORMAT OSON , '$.OBJECT_NEW_ID'
              RETURNING NUMBER ERROR ON ERROR
NULL ON EMPTY)=55)
 
Note
-----
   - automatic DOP: Computed Degree of
Parallelism is 1
 
ai_demo@ATP21C> set
autotrace off
 
 
the first point to note is that the CBO
now uses the newly created AI, as only one row is returned. In the next few blogposts we will see about the anomalies from the above plan.