One of many potential capabilities of
Oracle database 21c was the ability of auto indexing (AI) of JSON expressions.
Will start by creating a simple table that uses the new JSON datatype
(introduced in Oracle 21c – the following example was run from Oracle
Autonomous database cloud service having Oracle 21c(21.3) ) with some JSON
documents loaded into it.
ai_demo@ATP21C> create
table demo032
2
nologging as
3
select rownum as id,
4 a.created as created_dt,
5 json_object(
6 'OWNER' value a.OWNER ,
7 'OBJECT_NAME' value a.OBJECT_NAME ,
8 'SUBOBJECT_NAME' value a.SUBOBJECT_NAME ,
9 'OBJECT_ID' value a.OBJECT_ID ,
10 'OBJECT_NEW_ID' value rownum ,
11 'DATA_OBJECT_ID' value a.DATA_OBJECT_ID ,
12 'OBJECT_TYPE' value a.OBJECT_TYPE ,
13 'CREATED' value a.CREATED ,
14 'LAST_DDL_TIME' value a.LAST_DDL_TIME ,
15 'TIMESTAMP' value a.TIMESTAMP ,
16 'STATUS' value a.STATUS returning JSON ) my_json_data
17 from
stage a, stage b
18
where rownum <= 10000000 ;
Table created.
Then I run a few SQL statements with
different JSON expression-based predicates, like this
ai_demo@ATP21C> set
autotrace traceonly exp statistics
ai_demo@ATP21C> select *
from demo032 where json_value(my_json_data,'$.OBJECT_NEW_ID') = '55';
Execution Plan
----------------------------------------------------------
Plan hash value: 3377958815
-------------------------------------------------------------------------------------
| Id | Operation | Name | Rows
| Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | |
100K| 31M| 130K
(1)| 00:00:06 |
|* 1 |
TABLE ACCESS STORAGE FULL| DEMO032 |
100K| 31M| 130K
(1)| 00:00:06 |
-------------------------------------------------------------------------------------
Predicate Information
(identified by operation id):
---------------------------------------------------
1 -
storage(JSON_VALUE("MY_JSON_DATA" /*+ LOB_BY_VALUE */ FORMAT OSON ,
'$.OBJECT_NEW_ID' RETURNING
VARCHAR2(4000) NULL ON ERROR)='55')
filter(JSON_VALUE("MY_JSON_DATA" /*+ LOB_BY_VALUE */ FORMAT OSON ,
'$.OBJECT_NEW_ID' RETURNING
VARCHAR2(4000) NULL ON ERROR)='55')
Note
-----
- automatic DOP: Computed Degree of
Parallelism is 1
Statistics
----------------------------------------------------------
0
recursive calls
0
db block gets
478790
consistent gets
478784
physical reads
0
redo size
960
bytes sent via SQL*Net to client
433
bytes received via SQL*Net from client
3
SQL*Net roundtrips to/from client
0
sorts (memory)
0
sorts (disk)
1
rows processed
ai_demo@ATP21C> select *
from demo032 where json_value(my_json_data,'$.OBJECT_NEW_ID' returning number)
= 55;
Execution Plan
----------------------------------------------------------
Plan hash value: 3377958815
-------------------------------------------------------------------------------------
| Id | Operation | Name | Rows
| Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | |
100K| 31M| 130K
(1)| 00:00:06 |
|* 1 |
TABLE ACCESS STORAGE FULL| DEMO032 |
100K| 31M| 130K
(1)| 00:00:06 |
-------------------------------------------------------------------------------------
Predicate Information
(identified by operation id):
---------------------------------------------------
1 -
storage(JSON_VALUE("MY_JSON_DATA" /*+ LOB_BY_VALUE */ FORMAT OSON ,
'$.OBJECT_NEW_ID' RETURNING
NUMBER NULL ON ERROR)=55)
filter(JSON_VALUE("MY_JSON_DATA" /*+ LOB_BY_VALUE */ FORMAT OSON ,
'$.OBJECT_NEW_ID' RETURNING
NUMBER NULL ON ERROR)=55)
Note
-----
- automatic DOP: Computed Degree of
Parallelism is 1
Statistics
----------------------------------------------------------
0
recursive calls
0
db block gets
478790
consistent gets
478784
physical reads
0
redo size
960
bytes sent via SQL*Net to client
433
bytes received via SQL*Net from client
3
SQL*Net roundtrips to/from client
0
sorts (memory)
0
sorts (disk)
1
rows processed
ai_demo@ATP21C> select *
from demo032 t1 where t1.my_json_data.OBJECT_NEW_ID.number() = 55;
Execution Plan
----------------------------------------------------------
Plan hash value: 3377958815
-------------------------------------------------------------------------------------
| Id | Operation | Name | Rows
| Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | |
100K| 31M| 130K
(1)| 00:00:06 |
|* 1 |
TABLE ACCESS STORAGE FULL| DEMO032 |
100K| 31M| 130K
(1)| 00:00:06 |
-------------------------------------------------------------------------------------
Predicate Information
(identified by operation id):
---------------------------------------------------
1 -
storage(JSON_VALUE("T1"."MY_JSON_DATA" /*+ LOB_BY_VALUE
*/ FORMAT
OSON , '$.OBJECT_NEW_ID.number()'
RETURNING NUMBER NULL ON ERROR)=55)
filter(JSON_VALUE("T1"."MY_JSON_DATA" /*+
LOB_BY_VALUE */ FORMAT
OSON , '$.OBJECT_NEW_ID.number()'
RETURNING NUMBER NULL ON ERROR)=55)
Note
-----
- automatic DOP: Computed Degree of
Parallelism is 1
Statistics
----------------------------------------------------------
0
recursive calls
0
db block gets
478790
consistent gets
478784
physical reads
0
redo size
960
bytes sent via SQL*Net to client
433
bytes received via SQL*Net from client
3
SQL*Net roundtrips to/from client
0
sorts (memory)
0
sorts (disk)
1
rows processed
ai_demo@ATP21C> set
autotrace off
they all return just one row, but most
currently use full table scan with no indexes present.
So, what does AI make of things?
The first thing to note is that running
the AI report generates the following error.
ai_demo@ATP21C> set echo
on linesize 1000 trimspool on trim on long 2000000 longchunksize 2000000
ai_demo@ATP21C> col report
for a180
ai_demo@ATP21C> select
dbms_auto_index.report_activity(systimestamp -1/24,
systimestamp,'TEXT','ALL','ALL') report from dual;
ERROR:
ORA-30954: char 0 is invalid
in json_value(MY_JSON_DATA, '$.OBJECT_NEW_ID' returni
ORA-06512: at
"SYS.DBMS_AUTO_INDEX", line 107
ORA-06512: at
"SYS.DBMS_AUTO_INDEX_INTERNAL", line 8686
ORA-06512: at
"SYS.DBMS_AUTO_INDEX_INTERNAL", line 8686
ORA-06512: at
"SYS.DBMS_AUTO_INDEX_INTERNAL", line 9236
ORA-06512: at
"SYS.DBMS_AUTO_INDEX", line 89
ORA-06512: at line 1
no rows selected
If we look at the indexes now present in
the table
ai_demo@ATP21C> select
table_name,index_name,index_type,indexing,auto,
2
status,visibility
3 from
all_indexes
4
where table_name = 'DEMO032'
5 and
owner ='AI_DEMO';
TABLE_NAME INDEX_NAME INDEX_TYPE INDEXIN AUT STATUS VISIBILIT
----------
------------------------- --------------------------- ------- --- --------
---------
DEMO032 SYS_AI_fwkyzzbd70x9s FUNCTION-BASED NORMAL FULL
YES VALID VISIBLE
DEMO032 SYS_AI_17jynda4v9sxu FUNCTION-BASED NORMAL FULL
YES VALID VISIBLE
ai_demo@ATP21C> select
table_name,index_name,column_name,column_position
2 from
user_ind_columns
3 where
table_name = 'DEMO032'
4
order by 1,2;
TABLE_NAME INDEX_NAME COLUMN_NAME COLUMN_POSITION
----------
------------------------- --------------- ---------------
DEMO032 SYS_AI_17jynda4v9sxu SYS_NC00005$ 1
DEMO032 SYS_AI_fwkyzzbd70x9s SYS_NC00004$ 1
ai_demo@ATP21C> col
column_expression for a60
ai_demo@ATP21C> select
index_name,column_expression
2 from
user_ind_expressions
3
where table_name = 'DEMO032';
INDEX_NAME COLUMN_EXPRESSION
-------------------------
------------------------------------------------------------
SYS_AI_17jynda4v9sxu JSON_VALUE("MY_JSON_DATA"
FORMAT OSON , '$.OBJECT_NEW_ID' RE
TURNING NUMBER ERROR
ON ERROR NULL ON EMPTY)
SYS_AI_fwkyzzbd70x9s JSON_VALUE("MY_JSON_DATA"
FORMAT OSON , '$.OBJECT_NEW_ID' RE
TURNING
VARCHAR2(4000) ERROR ON ERROR NULL ON EMPTY)
We can see that AI has indeed created two
new automatic indexes, one on the varchar2 expression and other on numeric
expression. If we re-run the sql’s we notice few important points, note the
following was run after the automatic indexes were created.
ai_demo@ATP21C> set
autotrace traceonly exp
ai_demo@ATP21C> select *
from demo032 where json_value(my_json_data,'$.OBJECT_NEW_ID') = '55';
Execution Plan
----------------------------------------------------------
Plan hash value: 2321837570
------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 100K|
31M| 9154 (1)| 00:00:01 |
| 1 |
TABLE ACCESS BY INDEX ROWID BATCHED| DEMO032 | 100K|
31M| 9154 (1)| 00:00:01 |
|* 2 | INDEX RANGE SCAN | SYS_AI_fwkyzzbd70x9s |
40000 | | 3
(0)| 00:00:01 |
------------------------------------------------------------------------------------------------------------
Predicate Information
(identified by operation id):
---------------------------------------------------
2 -
access(JSON_VALUE("MY_JSON_DATA" /*+ LOB_BY_VALUE */ FORMAT OSON , '$.OBJECT_NEW_ID'
RETURNING VARCHAR2(4000) ERROR ON
ERROR NULL ON EMPTY)='55')
Note
-----
- automatic DOP: Computed Degree of
Parallelism is 1
ai_demo@ATP21C> select *
from demo032 where json_value(my_json_data,'$.OBJECT_NEW_ID' returning number)
= 55;
Execution Plan
----------------------------------------------------------
Plan hash value: 2305263965
------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 100K|
31M| 1920 (1)| 00:00:01 |
| 1 |
TABLE ACCESS BY INDEX ROWID BATCHED| DEMO032 | 100K|
31M| 1920 (1)| 00:00:01 |
|* 2 | INDEX RANGE SCAN | SYS_AI_17jynda4v9sxu |
40000 | | 3
(0)| 00:00:01 |
------------------------------------------------------------------------------------------------------------
Predicate Information
(identified by operation id):
---------------------------------------------------
2 -
access(JSON_VALUE("MY_JSON_DATA" /*+ LOB_BY_VALUE */ FORMAT OSON , '$.OBJECT_NEW_ID'
RETURNING NUMBER ERROR ON ERROR
NULL ON EMPTY)=55)
Note
-----
- automatic DOP: Computed Degree of
Parallelism is 1
ai_demo@ATP21C> select *
from demo032 t1 where t1.my_json_data.OBJECT_NEW_ID.number() = 55;
Execution Plan
----------------------------------------------------------
Plan hash value: 2305263965
------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 100K|
31M| 1920 (1)| 00:00:01 |
| 1 |
TABLE ACCESS BY INDEX ROWID BATCHED| DEMO032 | 100K|
31M| 1920 (1)| 00:00:01 |
|* 2 | INDEX RANGE SCAN | SYS_AI_17jynda4v9sxu |
40000 | | 3
(0)| 00:00:01 |
------------------------------------------------------------------------------------------------------------
Predicate Information
(identified by operation id):
---------------------------------------------------
2 -
access(JSON_VALUE("MY_JSON_DATA" /*+ LOB_BY_VALUE */ FORMAT OSON , '$.OBJECT_NEW_ID'
RETURNING NUMBER ERROR ON ERROR
NULL ON EMPTY)=55)
Note
-----
- automatic DOP: Computed Degree of
Parallelism is 1
ai_demo@ATP21C> set
autotrace off
the first point to note is that the CBO
now uses the newly created AI, as only one row is returned. In the next few blogposts we will see about the anomalies from the above plan.