Wednesday, January 21, 2026

Indexing JSON Arrays in Oracle Database - Part II

In the previous blog post, we explored how JSON array elements can be indexed using materialized views and relational flattening. While that approach delivers excellent performance, it comes at a cost — additional storage, maintenance overhead, and schema complexity.
 
Starting with Oracle Database 21c and above, Oracle introduced a much more elegant and native solution for indexing JSON arrays:
👉 Multi-Value Indexes
 
Multi-value indexes allow Oracle to directly index JSON array elements — without flattening the data, without materialized views, and without rewriting your schema.
 
In this post, we will explore: 
  • What multi-value indexes are
  • How they improve JSON array search performance
  • How they can index nested sibling arrays
  • How to handle queries on specific array positions 
Traditional B-tree and function-based indexes expect one value per row. But JSON arrays naturally contain multiple values per document.
 
A multi-value index solves this mismatch by allowing multiple index entries per row, one for each array element.
 
This allows Oracle to treat JSON arrays as first-class indexable structures.
 
Let’s start by indexing the products[*].id field from our JSON document.
 
demo@ADB26ai> create multivalue index my_demo_idx on my_demo(
  2      json_table( c2 ,'$.products[*]'
  3          error on error null on empty null on mismatch
  4          columns ( object_id number path '$.id')
  5      )
  6  );
 
Index created.
 
This creates a native multi-value index — one index entry per product ID inside every JSON document. Now run the same query we used earlier:
 
demo@ADB26ai> select *
  2  from my_demo
  3  where json_exists( c2, '$.products[*]?(@.id.number() == 55 )' );
 
122 rows selected.
 
 
Execution Plan
----------------------------------------------------------
Plan hash value: 3662244578
 
-------------------------------------------------------------------------------------------------------------------
| Id  | Operation                            | Name                       | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                     |                            |     2 |  1456 |     4   (0)| 00:00:01 |
|   1 |  RESULT CACHE                        | bhn66uu5b6guf8qzf1w80zdqqr |     2 |  1456 |     4   (0)| 00:00:01 |
|   2 |   TABLE ACCESS BY INDEX ROWID BATCHED| MY_DEMO                    |     2 |  1456 |     4   (0)| 00:00:01 |
|   3 |    SORT CLUSTER BY ROWID BATCHED     |                            |     2 |       |     3   (0)| 00:00:01 |
|*  4 |     INDEX RANGE SCAN (MULTI VALUE)   | MY_DEMO_IDX                |     2 |       |     3   (0)| 00:00:01 |
-------------------------------------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   4 - access(JSON_QUERY("C2" /*+ LOB_BY_VALUE */  FORMAT OSON , '$.products[*].id' RETURNING NUMBER ASIS
              WITHOUT ARRAY WRAPPER ERROR ON ERROR NULL ON EMPTY NULL ON MISMATCH TYPE(LAX)  MULTIVALUE)=55)
 
Result Cache Information (identified by operation id):
------------------------------------------------------
 
   1 - column-count=2; dependencies=(DEMO.MY_DEMO); name="select *
from my_demo
where json_exists( c2, '$.products[*]?(@.id.number() == 55 )' )"
 
 
Note
-----
   - dynamic statistics used: dynamic sampling (level=AUTO (SYSTEM))
   - automatic DOP: Computed Degree of Parallelism is 1 because of no expensive parallel operation
 
 
Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
         18  consistent gets
          0  physical reads
          0  redo size
      92501  bytes sent via SQL*Net to client
        401  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
         25  sorts (memory)
          0  sorts (disk)
        122  rows processed
 
 
This multi-value index is much more efficient than the materialized view based solutions, Direct index access into JSON arrays and Only 18 consistent gets
 
One of the most powerful features of multi-value indexes is their ability to index sibling arrays inside JSON structures.
 
In our JSON document, each product has a nested dimension
array:
 
{
  "id": 1,
  "products": [
    {
      "id": 55,
      "title": "Charger SXT RWD",
      "price": 32999.99,
      "quantity": 3,
      "dimension": [935, 412, 120, 88]
    },
    {
      "id": 56,
      "title": "Apple MacBook Pro 14 Inch Space Grey",
      "price": 1999.99,
      "quantity": 2,
      "dimension": [210, 330, 40, 25]
    }
  ],
  "total": 103774.85
}
 
 
We can extend the multi-value index to include both id and dimension:
 
demo@ADB26ai> drop index my_demo_idx ;
 
Index dropped.
 
demo@ADB26ai> create multivalue index my_demo_idx on my_demo(
  2      json_table( c2 ,'$.products[*]'
  3          error on error null on empty null on mismatch
  4          columns ( object_id number path '$.id' ,
  5                     nested path dimension[*]
  6                     columns( dim number path '$')
  7             )
  8      )
  9  );
 
Index created.
 
Now that the dimension array inside the products array is indexed, queries that filter on both product attributes and dimension values can efficiently leverage the multi-value index.
 
demo@ADB26ai> select * from my_demo
  2  where json_exists( c2, '$.products[*]?(@.id.number() == 55
  3     && @.dimension[*].number() == 662 )' ) ;
 
 
Execution Plan
----------------------------------------------------------
Plan hash value: 3662244578
 
-------------------------------------------------------------------------------------------------------------------
| Id  | Operation                            | Name                       | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                     |                            |     1 |   728 |     2   (0)| 00:00:01 |
|   1 |  RESULT CACHE                        | 4888wgv2m25hdfrrua1uxadh6y |     1 |   728 |     2   (0)| 00:00:01 |
|   2 |   TABLE ACCESS BY INDEX ROWID BATCHED| MY_DEMO                    |     1 |   728 |     2   (0)| 00:00:01 |
|   3 |    SORT CLUSTER BY ROWID BATCHED     |                            |     1 |       |     2   (0)| 00:00:01 |
|*  4 |     INDEX RANGE SCAN (MULTI VALUE)   | MY_DEMO_IDX                |     1 |       |     2   (0)| 00:00:01 |
-------------------------------------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   4 - access(JSON_QUERY("C2" /*+ LOB_BY_VALUE */  FORMAT OSON , '$.products[*].id' RETURNING NUMBER ASIS
              WITHOUT ARRAY WRAPPER ERROR ON ERROR NULL ON EMPTY NULL ON MISMATCH TYPE(LAX)  MULTIVALUE)=55 AND
              JSON_QUERY("C2" /*+ LOB_BY_VALUE */  FORMAT OSON , '$.products[*].dimension[*]' RETURNING NUMBER ASIS
              WITHOUT ARRAY WRAPPER ERROR ON ERROR NULL ON EMPTY NULL ON MISMATCH TYPE(LAX)  MULTIVALUE)=662)
 
Result Cache Information (identified by operation id):
------------------------------------------------------
 
   1 - column-count=2; dependencies=(DEMO.MY_DEMO); name="select * from my_demo
where json_exists( c2, '$.products[*]?(@.id.number() == 55
        && @.dimension[*].number() == 662 )' ) "
 
 
Note
-----
   - automatic DOP: Computed Degree of Parallelism is 1 because of no expensive parallel operation
 
 
Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
         10  consistent gets
          0  physical reads
          0  redo size
       1239  bytes sent via SQL*Net to client
        405  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          1  sorts (memory)
          0  sorts (disk)
          1    rows processed
 
This query Uses the multi-value index Avoids scanning JSON documents Executes with only 10 consistent gets Returns results in milliseconds, However the above multi value index wont be efficient for Queries targeting specific array position
 
demo@ADB26ai> select * from my_demo where json_exists( c2, '$.products[*]?(@.id.number() == 55
  2     && @.dimension[0].number() == 662 )' ) ;
 
 
Execution Plan
----------------------------------------------------------
Plan hash value: 3662244578
 
-------------------------------------------------------------------------------------------------------------------
| Id  | Operation                            | Name                       | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                     |                            |     1 |   728 |     2   (0)| 00:00:01 |
|   1 |  RESULT CACHE                        | 3kjsv3q8tq4sy40thmuuv5n384 |     1 |   728 |     2   (0)| 00:00:01 |
|*  2 |   TABLE ACCESS BY INDEX ROWID BATCHED| MY_DEMO                    |     1 |   728 |     2   (0)| 00:00:01 |
|   3 |    SORT CLUSTER BY ROWID BATCHED     |                            |     1 |       |     2   (0)| 00:00:01 |
|*  4 |     INDEX RANGE SCAN (MULTI VALUE)   | MY_DEMO_IDX                |     1 |       |     2   (0)| 00:00:01 |
-------------------------------------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   2 - filter(JSON_EXISTS2("C2" /*+ LOB_BY_VALUE */  FORMAT OSON , '$.products[*]?(@.id.number() == 55  &&
              @.dimension[0].number() == 662 )' /* json_path_str  $.products[*]?((@.id.number() == 55) &&
              (@.dimension[0].number() == 662))  */  FALSE ON ERROR TYPE(LAX) )=1)
   4 - access(JSON_QUERY("C2" /*+ LOB_BY_VALUE */  FORMAT OSON , '$.products[*].id' RETURNING NUMBER ASIS
              WITHOUT ARRAY WRAPPER ERROR ON ERROR NULL ON EMPTY NULL ON MISMATCH TYPE(LAX)  MULTIVALUE)=55 AND
              JSON_QUERY("C2" /*+ LOB_BY_VALUE */  FORMAT OSON , '$.products[*].dimension[*]' RETURNING NUMBER ASIS
              WITHOUT ARRAY WRAPPER ERROR ON ERROR NULL ON EMPTY NULL ON MISMATCH TYPE(LAX)  MULTIVALUE)=662)
 
Result Cache Information (identified by operation id):
------------------------------------------------------
 
   1 - column-count=2; dependencies=(DEMO.MY_DEMO); name="select * from my_demo where json_exists( c2, '$.products[*]?(@.id.number() == 55
        && @.dimension[0].number() == 662 )' ) "
 
 
Note
-----
   - automatic DOP: Computed Degree of Parallelism is 1 because of no expensive parallel operation
 
 
Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
         10  consistent gets
          0  physical reads
          0  redo size
       1239  bytes sent via SQL*Net to client
        405  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          1  sorts (memory)
          0  sorts (disk)
          1    rows processed
 
 
In the above query, Oracle first probes the multi-value index using the path expression filter:
$.products[*]?(@.id.number() == 55)
 
This predicate is used to identify all rowids where at least one product has an id value of 55. For each matching rowid returned from the index, Oracle then accesses the base table and applies the additional predicate:
 
@.dimension[0].number() == 662
 
Only the rows that satisfy this second condition are finally returned to the client. Rows that fail this predicate are discarded during the table access phase. This behavior is clearly visible in the execution plan, where the predicates appear separately under the ACCESS and FILTER sections.
 
The performance impact of this execution strategy depends on selectivity:

  • If many rows match the first predicate (product id = 55), a large number of rowids will be fetched from the index and used to probe the table.
  • If only a few of those rows satisfy the second predicate (dimension[0] = 662), most of the table lookups will be wasted work and will be discarded during filtering. 
This results in unnecessary I/O and increased CPU usage.
 
For queries that target a specific array position, the optimal solution is to include the ordinality pseudo-column in the multi-value index definition. By indexing the array position along with the value, Oracle can fully resolve both predicates at the index level and avoid redundant table probes — delivering true index-driven execution for position-based JSON queries.
 
To make queries on specific array positions fully indexable, we can include the ordinality pseudocolumn
.
 
demo@ADB26ai> drop index my_demo_idx;
 
Index dropped.
 
demo@ADB26ai>
demo@ADB26ai> create multivalue index my_demo_idx on my_demo(
  2      json_table( c2 ,'$.products[*]'
  3          error on error null on empty null on mismatch
  4          columns ( object_id number path '$.id' ,
  5                     nested path dimension[*]
  6                     columns( dim number path '$'
  7                             , idx for ordinality )
  8             )
  9      )
 10  );
 
Index created.
 
This adds the array position (idx) into the index.
 
demo@ADB26ai> select * from my_demo where json_exists( c2, '$.products[*]?(@.id.number() == 55
  2     && @.dimension[0].number() == 662 )' ) ;
 
 
Execution Plan
----------------------------------------------------------
Plan hash value: 3662244578
 
-------------------------------------------------------------------------------------------------------------------
| Id  | Operation                            | Name                       | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                     |                            |     1 |   728 |     2   (0)| 00:00:01 |
|   1 |  RESULT CACHE                        | 3kjsv3q8tq4sy40thmuuv5n384 |     1 |   728 |     2   (0)| 00:00:01 |
|   2 |   TABLE ACCESS BY INDEX ROWID BATCHED| MY_DEMO                    |     1 |   728 |     2   (0)| 00:00:01 |
|   3 |    SORT CLUSTER BY ROWID BATCHED     |                            |     1 |       |     2   (0)| 00:00:01 |
|*  4 |     INDEX RANGE SCAN (MULTI VALUE)   | MY_DEMO_IDX                |     1 |       |     2   (0)| 00:00:01 |
-------------------------------------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   4 - access(JSON_QUERY("C2" /*+ LOB_BY_VALUE */  FORMAT OSON , '$.products[*].id' RETURNING NUMBER ASIS
              WITHOUT ARRAY WRAPPER ERROR ON ERROR NULL ON EMPTY NULL ON MISMATCH TYPE(LAX)  MULTIVALUE)=55 AND
              JSON_QUERY("C2" /*+ LOB_BY_VALUE */  FORMAT OSON , '$.products[*].dimension[*]' RETURNING NUMBER ASIS
              WITHOUT ARRAY WRAPPER ERROR ON ERROR NULL ON EMPTY NULL ON MISMATCH TYPE(LAX)  MULTIVALUE)=662 AND
              JSON_QUERY("C2" /*+ LOB_BY_VALUE */  FORMAT OSON , '$.products[*].dimension[*].position' RETURNING NUMBER
              ASIS  WITHOUT ARRAY WRAPPER ERROR ON ERROR NULL ON EMPTY NULL ON MISMATCH TYPE(LAX)  MULTIVALUE)=0)
 
Result Cache Information (identified by operation id):
------------------------------------------------------
 
   1 - column-count=2; dependencies=(DEMO.MY_DEMO); name="select * from my_demo where json_exists( c2, '$.products[*]?(@.id.number() == 55
        && @.dimension[0].number() == 662 )' ) "
 
 
Note
-----
   - automatic DOP: Computed Degree of Parallelism is 1 because of no expensive parallel operation
 
 
Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
         10  consistent gets
          0  physical reads
          0  redo size
       1239  bytes sent via SQL*Net to client
        405  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          1  sorts (memory)
          0  sorts (disk)
          1    rows processed
 
 
Now everything is handled by the index: 
  • product_id = 55
  • dimension = 662
  • position = 0 
There is no separate FILTER step — only indexed access. This makes even position-based JSON array queries fully scalable.
 
Multi-value indexes represent a major leap forward in Oracle’s JSON indexing architecture.

They provide:
  • Native indexing of JSON arrays
  • Direct index access without flattening
  • Support for nested sibling arrays
  • Efficient position-based filtering with ordinality
  • Minimal storage and maintenance overhead
This makes multi-value indexing the best-practice approach for high-performance JSON workloads in Oracle Database 21c and above.
 
In the next blog post, we’ll explore how Oracle’s native JSON storage formats (OSON and binary JSON) further enhance query performance and indexing efficiency.
 
Stay tuned!

 

No comments:

Post a Comment