THE RAJESH BLOG: JSON Search Index - Part II (SEARCH

In the previous post, we explored how a JSON search index created with SEARCH_ON = TEXT can significantly improve performance for full-text searches. It works very well with operators like JSON_TEXTCONTAINS and is particularly useful when most queries involve searching textual content inside JSON documents. However, one major limitation quickly becomes apparent: TEXT-based search indexes are not effective for range-based predicates or numeric filtering.

This is where the SEARCH_ON = VALUE option becomes important. While TEXT indexes focus only on textual content, VALUE-based search indexes are designed to support structured filtering, including equality and range predicates on numeric and string values. In other words, if your JSON queries involve comparisons such as greater than, less than, or between conditions, VALUE-based indexing becomes far more useful.

Creating a VALUE-based search index is straightforward:

demo@ADB26AI> create search index my_demo_idx
2 on my_demo(c2)
3 for json parameters(' search_on VALUE ');

Index created.

Once created, one of the first things you may notice is the size of the index. VALUE-based indexes are significantly larger than TEXT-based indexes. In our case, the resulting index size is approximately 1.9 GB:

demo@ADB26AI> select round(sum(bytes)/1024/1024,2) size_mb
2 from user_segments
3 where segment_name like '%MY_DEMO_IDX%';

   SIZE_MB
----------
   1916.75

By comparison, the TEXT-based index on the same data occupies only about 893 MB.

demo@ADB26ai> create search index my_demo_idx
2 on my_demo(c2)
3 for json parameters(' search_on TEXT ');

Index created.

demo@ADB26ai>
demo@ADB26ai> select round(sum(bytes)/1024/1024,2) size_mb
2 from user_segments
3 where segment_name like '%MY_DEMO_IDX%';

   SIZE_MB
----------
    893.25

The reason for this difference is simple. A TEXT-based index stores only textual content extracted from JSON documents, whereas a VALUE-based index stores multiple datatypes including numbers, timestamps, and character values. Because more data types are indexed, the index footprint naturally becomes larger.

With a VALUE-based index in place, attempting to perform a full-text search using JSON_TEXTCONTAINS results in an error:

demo@ADB26AI> select *
2 from my_demo
3 where json_textcontains( c2, '$.products.str','ljfycrixvcehvmpmldubibaur') ;
from my_demo
     *
ERROR at line 2:
ORA-40467: JSON_TEXTCONTAINS() cannot be evaluated without a JSON-enabled context index
Help: https://docs.oracle.com/error-help/db/ora-40467/

This fails because VALUE-based indexes do not support full-text search. They are not JSON-enabled context indexes, and therefore cannot be used by JSON_TEXTCONTAINS. This highlights an important design choice: VALUE indexes are built for structured comparisons, not keyword-style searches.

Where VALUE-based indexing shines is in range-based filtering. Consider a query that filters values within a range:

demo@ADB26AI> select *
2 from my_demo nested c2.products[*]
3      columns ( x1 varchar2(40) path '$.str')
4 where x1 between 'a' and 'b';

Execution Plan
----------------------------------------------------------
Plan hash value: 3720428357

-----------------------------------------------------------------------------------------------
| Id | Operation                       | Name        | Rows | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                |             |    31M|    21G| 1268K (19)| 00:00:50 |
|   1 | NESTED LOOPS                   |             |    31M|    21G| 1268K (19)| 00:00:50 |
|   2 |   TABLE ACCESS BY INDEX ROWID   | MY_DEMO     | 2500 | 1777K|   488   (0)| 00:00:01 |
|   3 |    SORT CLUSTER BY ROWID BATCHED|             |       |       |     4   (0)| 00:00:01 |
|* 4 |     DOMAIN INDEX                | MY_DEMO_IDX |       |       |     4   (0)| 00:00:01 |
|* 5 |   JSONTABLE EVALUATION          |             | 12500 | 25000 |   507 (18)| 00:00:01 |
-----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   4 - access("CTXSYS"."CONTAINS"("MY_DEMO"."C2" /*+ LOB_BY_VALUE */
              ,'((sdata(FVCH_073E8889C10DE4900FDC7FB51B882966_str >= "a" )) and
              (sdata(FVCH_073E8889C10DE4900FDC7FB51B882966_str <= "b" )))')>0)
   5 - filter("P"."X1">='a' AND "P"."X1"<='b')

When examining the execution plan, we see that the search index is used effectively. The access predicate shows that both the JSON path and the range values are used to navigate the index. Oracle constructs an internal representation of the range condition and uses it directly within the index lookup. This allows Oracle to retrieve only the rowids that satisfy the range condition, which are then used to fetch the corresponding rows from the table. Unlike the TEXT index scenario, the filtering is not deferred to the table access stage; it is performed as part of the index navigation itself.

The same behavior can be observed when using JSON_EXISTS with numeric range conditions:

demo@ADB26AI> select *
2 from my_demo
3 where json_exists( c2 ,'$.products[*]?(@.id >= 5)');

Execution Plan
----------------------------------------------------------
Plan hash value: 2005405343

----------------------------------------------------------------------------------------------
| Id | Operation                      | Name        | Rows | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT               |             |    25 | 18200 |   488   (0)| 00:00:01 |
|* 1 | TABLE ACCESS BY INDEX ROWID   | MY_DEMO     |    25 | 18200 |   488   (0)| 00:00:01 |
|   2 |   SORT CLUSTER BY ROWID BATCHED|             |       |       |     4   (0)| 00:00:01 |
|* 3 |    DOMAIN INDEX                | MY_DEMO_IDX |       |       |     4   (0)| 00:00:01 |
----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter(JSON_EXISTS2("C2" /*+ LOB_BY_VALUE */ FORMAT OSON ,
              '$.products[*]?(@.id >= 5)' /* json_path_str $.products[*]?(@.id.number() >= 5) */
              FALSE ON ERROR TYPE(LAX) )=1)
   3 - access("CTXSYS"."CONTAINS"("MY_DEMO"."C2" /*+ LOB_BY_VALUE */
              ,'(sdata(FNUM_6AA7829B12A1ECA7231B4272D111B8B4_id >= 5 ))')>0)

Here again, the predicate section of the execution plan shows that Oracle navigates the index using both the path expression and the comparison value. This ensures that only relevant rows are returned from the index, making the operation efficient even for large datasets.

This behavior remains consistent when queries are written using JSON_TABLE or the SQL nested clause.

demo@ADB26AI> select *
2 from my_demo nested c2.products[*]
3      columns ( x1 number path '$.id')
4 where x1 between 5 and 12;

Execution Plan
----------------------------------------------------------
Plan hash value: 3720428357

-----------------------------------------------------------------------------------------------
| Id | Operation                       | Name        | Rows | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                |             |    31M|    21G| 1268K (19)| 00:00:50 |
|   1 | NESTED LOOPS                   |             |    31M|    21G| 1268K (19)| 00:00:50 |
|   2 |   TABLE ACCESS BY INDEX ROWID   | MY_DEMO     | 2500 | 1777K|   488   (0)| 00:00:01 |
|   3 |    SORT CLUSTER BY ROWID BATCHED|             |       |       |     4   (0)| 00:00:01 |
|* 4 |     DOMAIN INDEX                | MY_DEMO_IDX |       |       |     4   (0)| 00:00:01 |
|* 5 |   JSONTABLE EVALUATION          |             | 12500 | 25000 |   507 (18)| 00:00:01 |
-----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   4 - access("CTXSYS"."CONTAINS"("MY_DEMO"."C2" /*+ LOB_BY_VALUE */
              ,'((sdata(FNUM_6AA7829B12A1ECA7231B4272D111B8B4_id >= 5 )) and
              (sdata(FNUM_6AA7829B12A1ECA7231B4272D111B8B4_id <= 12 )))')>0)
   5 - filter("P"."X1">=5 AND "P"."X1"<=12)

demo@ADB26AI> select *
2 from my_demo , json_table( c2, '$.products[*]'
3     columns( x1 number path '$.id' ) ) jt
4 where jt.x1 between 5 and 12;

Execution Plan
----------------------------------------------------------
Plan hash value: 3720428357

-----------------------------------------------------------------------------------------------
| Id | Operation                       | Name        | Rows | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                |             |    31M|    21G| 1268K (19)| 00:00:50 |
|   1 | NESTED LOOPS                   |             |    31M|    21G| 1268K (19)| 00:00:50 |
|   2 |   TABLE ACCESS BY INDEX ROWID   | MY_DEMO     | 2500 | 1777K|   488   (0)| 00:00:01 |
|   3 |    SORT CLUSTER BY ROWID BATCHED|             |       |       |     4   (0)| 00:00:01 |
|* 4 |     DOMAIN INDEX                | MY_DEMO_IDX |       |       |     4   (0)| 00:00:01 |
|* 5 |   JSONTABLE EVALUATION          |             | 12500 | 25000 |   507 (18)| 00:00:01 |
-----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   4 - access("CTXSYS"."CONTAINS"("MY_DEMO"."C2" /*+ LOB_BY_VALUE */
              ,'((sdata(FNUM_6AA7829B12A1ECA7231B4272D111B8B4_id >= 5 )) and
              (sdata(FNUM_6AA7829B12A1ECA7231B4272D111B8B4_id <= 12 )))')>0)
   5 - filter("P"."X1">=5 AND "P"."X1"<=12)

Whether the JSON values are projected into relational columns or accessed directly through JSON operators, Oracle is able to use the VALUE-based search index to evaluate range predicates efficiently. In each case, the index is navigated using both the path expression and the range values, confirming that the filtering logic is pushed into the index itself.

Although VALUE-based indexing is powerful, its default configuration indexes multiple datatypes and can therefore consume considerable storage. Oracle provides a useful variant that allows indexing only specific datatypes. For example, if the workload primarily involves numeric filtering, it is possible to create an index that targets only numeric values:

demo@ADB26AI> drop index if exists my_demo_idx;

Index dropped.

demo@ADB26AI> create search index my_demo_idx
2 on my_demo(c2)
3 for json parameters(' search_on VALUE(NUMBER) ');

Index created.

This significantly reduces the index size. In our case, the index size drops to around 836 MB, which is almost half the size of the default VALUE-based index.

demo@ADB26AI> select round(sum(bytes)/1024/1024,2) size_mb
2 from user_segments
3 where segment_name like '%MY_DEMO_IDX%';

   SIZE_MB
----------
    836.81

By indexing only numeric values, the index becomes more focused and efficient for numeric searches while avoiding unnecessary storage overhead.

With this numeric-only VALUE index in place, equality-based searches perform efficiently:

demo@ADB26AI> select *
2 from my_demo
3 where json_exists( c2 ,'$.products[*]?(@.id.number() == 5)');

Execution Plan
----------------------------------------------------------
Plan hash value: 2005405343

----------------------------------------------------------------------------------------------
| Id | Operation                      | Name        | Rows | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT               |             |    25 | 18200 |   488   (0)| 00:00:01 |
|* 1 | TABLE ACCESS BY INDEX ROWID   | MY_DEMO     |    25 | 18200 |   488   (0)| 00:00:01 |
|   2 |   SORT CLUSTER BY ROWID BATCHED|             |       |       |     4   (0)| 00:00:01 |
|* 3 |    DOMAIN INDEX                | MY_DEMO_IDX |       |       |     4   (0)| 00:00:01 |
----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter(JSON_EXISTS2("C2" /*+ LOB_BY_VALUE */ FORMAT OSON ,
              '$.products[*]?(@.id.number() == 5)' /* json_path_str $.products[*]?(@.id.number()
              == 5) */ FALSE ON ERROR TYPE(LAX) )=1)
   3 - access("CTXSYS"."CONTAINS"("MY_DEMO"."C2" /*+ LOB_BY_VALUE */
              ,'(sdata(FNUM_6AA7829B12A1ECA7231B4272D111B8B4_id = 5 ))')>0)

The execution plan confirms that Oracle uses both the path and the numeric value to navigate the index. The same efficiency applies to range-based numeric searches such as:

demo@ADB26AI> select *
2 from my_demo nested c2.products[*]
3      columns ( x1 number path '$.id')
4 where x1 between 5 and 12;

Execution Plan
----------------------------------------------------------
Plan hash value: 3720428357

-----------------------------------------------------------------------------------------------
| Id | Operation                       | Name        | Rows | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                |             |    31M|    21G| 1268K (19)| 00:00:50 |
|   1 | NESTED LOOPS                   |             |    31M|    21G| 1268K (19)| 00:00:50 |
|   2 |   TABLE ACCESS BY INDEX ROWID   | MY_DEMO     | 2500 | 1777K|   488   (0)| 00:00:01 |
|   3 |    SORT CLUSTER BY ROWID BATCHED|             |       |       |     4   (0)| 00:00:01 |
|* 4 |     DOMAIN INDEX                | MY_DEMO_IDX |       |       |     4   (0)| 00:00:01 |
|* 5 |   JSONTABLE EVALUATION          |             | 12500 | 25000 |   507 (18)| 00:00:01 |
-----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   4 - access("CTXSYS"."CONTAINS"("MY_DEMO"."C2" /*+ LOB_BY_VALUE */
              ,'((sdata(FNUM_6AA7829B12A1ECA7231B4272D111B8B4_id >= 5 )) and
              (sdata(FNUM_6AA7829B12A1ECA7231B4272D111B8B4_id <= 12 )))')>0)
   5 - filter("P"."X1">=5 AND "P"."X1"<=12)

In each case, the index lookup incorporates both the path expression and the numeric comparison values, allowing Oracle to return only the relevant rowids.

However, it is important to understand the trade-off. Since this variant indexes only numeric datatypes, textual searches will not benefit from value-based filtering. If we attempt a text equality search on a string field, the execution plan may still show index usage,

demo@ADB26AI> select *
2 from my_demo nested c2.products[*]
3      columns ( x1 varchar2(40) path '$.str')
4 where x1 = 'ljfycrixvcehvmpmldubibaur';

Execution Plan
----------------------------------------------------------
Plan hash value: 3720428357

-----------------------------------------------------------------------------------------------
| Id | Operation                       | Name        | Rows | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                |             |   125M|    84G| 1261K (18)| 00:00:50 |
|   1 | NESTED LOOPS                   |             |   125M|    84G| 1261K (18)| 00:00:50 |
|   2 |   TABLE ACCESS BY INDEX ROWID   | MY_DEMO     | 2500 | 1777K|   488   (0)| 00:00:01 |
|   3 |    SORT CLUSTER BY ROWID BATCHED|             |       |       |     4   (0)| 00:00:01 |
|* 4 |     DOMAIN INDEX                | MY_DEMO_IDX |       |       |     4   (0)| 00:00:01 |
|* 5 |   JSONTABLE EVALUATION          |             | 50000 |    97K|   505 (18)| 00:00:01 |
-----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   4 - access("CTXSYS"."CONTAINS"("MY_DEMO"."C2" /*+ LOB_BY_VALUE */
              ,'(HASPATH(/products/str))')>0)
   5 - filter("P"."X1"='ljfycrixvcehvmpmldubibaur')

but the access predicate reveals that Oracle navigates the index using only the path expression and not the actual text value. This means Oracle retrieves rows containing that path and then applies the text filter after accessing the table. While the index still contributes partially, it does not provide the same efficiency as a TEXT-based search index for textual searches.

This brings us to the key design consideration. Choosing between TEXT and VALUE search indexes depends entirely on the query workload. TEXT-based indexes are ideal for keyword-style searches and textual filtering, whereas VALUE-based indexes are optimized for structured comparisons, numeric filtering, and range predicates. In environments where both types of queries are common, understanding these trade-offs becomes essential for designing an effective indexing strategy.

In the next post of this series, we will explore combined search index configurations and discuss how to balance full-text and structured filtering requirements in real-world JSON workloads.

Tuesday, February 17, 2026

JSON Search Index - Part II (SEARCH_ON = VALUE)

No comments:

Post a Comment