Wednesday, February 25, 2026

JSON Search Index - Part III (SEARCH_ON = TEXT_VALUE)


When we explored JSON search indexes earlier, we looked at how the SEARCH_ON parameter can be set to either TEXT or VALUE. Using SEARCH_ON=TEXT enables efficient full-text searches, while SEARCH_ON=VALUE helps with structured filtering such as equality and range predicates across supported datatypes. However, there has always been a practical limitation: it was not possible to efficiently support both full-text search and range-based filtering using a single search index configuration.
 
This is where the TEXT_VALUE option comes into play.
 
The TEXT_VALUE parameter brings both capabilities under a single umbrella. With this configuration, the same JSON search index can support full-text search as well as range-based filtering across supported datatypes. For workloads that combine keyword searches with structured comparisons, this becomes a powerful and practical indexing strategy.
 
To create a search index that supports both full-text and range-based search, we can define the index using the TEXT_VALUE parameter:
 
demo@ADB26AI> create search index my_demo_idx
  2  on my_demo(c2)
  3  for json parameters(' search_on TEXT_VALUE ');
 
Index created.
 
Once created, one of the first observations is the index size. In this example, the resulting index size is approximately 1.4 GB:
 
demo@ADB26AI> select round(sum(bytes)/1024/1024,2) size_mb
  2  from user_segments
  3  where segment_name like '%MY_DEMO_IDX%';
 
   SIZE_MB
----------
   1395.06
 
This size sits between the typical footprint of TEXT-only and VALUE-only indexes because the index now stores both textual and structured value information required to support multiple query patterns.
 
With this index in place, let us first look at numeric predicates. When executing equality or range-based numeric searches using JSON_EXISTS,

 
demo@ADB26AI> select *
  2  from my_demo
  3  where json_exists( c2 ,'$.products[*]?(@.id.number() == 42)');
 
Execution Plan
----------------------------------------------------------
Plan hash value: 2005405343
 
----------------------------------------------------------------------------------------------
| Id  | Operation                      | Name        | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT               |             |    25 | 18200 |   488   (0)| 00:00:01 |
|*  1 |  TABLE ACCESS BY INDEX ROWID   | MY_DEMO     |    25 | 18200 |   488   (0)| 00:00:01 |
|   2 |   SORT CLUSTER BY ROWID BATCHED|             |       |       |     4   (0)| 00:00:01 |
|*  3 |    DOMAIN INDEX                | MY_DEMO_IDX |       |       |     4   (0)| 00:00:01 |
----------------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   1 - filter(JSON_EXISTS2("C2" /*+ LOB_BY_VALUE */  FORMAT OSON ,
              '$.products[*]?(@.id.number() == 42)' /* json_path_str  $.products[*]?(@.id.number()
              == 42)  */  FALSE ON ERROR TYPE(LAX) )=1)
   3 - access("CTXSYS"."CONTAINS"("MY_DEMO"."C2" /*+ LOB_BY_VALUE */
              ,'(sdata(FNUM_6AA7829B12A1ECA7231B4272D111B8B4_id  = 42 ))')>0)
 
 
the execution plan clearly shows that the JSON search index is used effectively. Both the path expression and the numeric comparison value are pushed into the index lookup. The index returns only the matching rowids, which are then used to access the table rows. This ensures efficient filtering even for large datasets.
 
The same efficient behavior can be observed for range-based numeric conditions. Whether using equality comparisons, greater-than conditions, or BETWEEN predicates, the optimizer is able to use both the path expression and the numeric values to navigate the index. This confirms that filtering is performed inside the index rather than being deferred to the table access step.
 
demo@ADB26AI> select *
  2  from my_demo
  3  where json_exists( c2 ,'$.products[*]?(@.id.number() >= 5)');
 
Execution Plan
----------------------------------------------------------
Plan hash value: 2005405343
 
----------------------------------------------------------------------------------------------
| Id  | Operation                      | Name        | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT               |             |    25 | 18200 |   488   (0)| 00:00:01 |
|*  1 |  TABLE ACCESS BY INDEX ROWID   | MY_DEMO     |    25 | 18200 |   488   (0)| 00:00:01 |
|   2 |   SORT CLUSTER BY ROWID BATCHED|             |       |       |     4   (0)| 00:00:01 |
|*  3 |    DOMAIN INDEX                | MY_DEMO_IDX |       |       |     4   (0)| 00:00:01 |
----------------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   1 - filter(JSON_EXISTS2("C2" /*+ LOB_BY_VALUE */  FORMAT OSON ,
              '$.products[*]?(@.id.number() >= 5)' /* json_path_str  $.products[*]?(@.id.number()
              >= 5)  */  FALSE ON ERROR TYPE(LAX) )=1)
   3 - access("CTXSYS"."CONTAINS"("MY_DEMO"."C2" /*+ LOB_BY_VALUE */
              ,'(sdata(FNUM_6AA7829B12A1ECA7231B4272D111B8B4_id  >= 5 ))')>0)
 
 
demo@ADB26AI> select *
  2  from my_demo nested c2.products[*]
  3      columns ( x1 number path '$.id')
  4  where x1 between 5 and 12;
 
Execution Plan
----------------------------------------------------------
Plan hash value: 3720428357
 
-----------------------------------------------------------------------------------------------
| Id  | Operation                       | Name        | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                |             |    31M|    21G|  1268K (19)| 00:00:50 |
|   1 |  NESTED LOOPS                   |             |    31M|    21G|  1268K (19)| 00:00:50 |
|   2 |   TABLE ACCESS BY INDEX ROWID   | MY_DEMO     |  2500 |  1777K|   488   (0)| 00:00:01 |
|   3 |    SORT CLUSTER BY ROWID BATCHED|             |       |       |     4   (0)| 00:00:01 |
|*  4 |     DOMAIN INDEX                | MY_DEMO_IDX |       |       |     4   (0)| 00:00:01 |
|*  5 |   JSONTABLE EVALUATION          |             | 12500 | 25000 |   507  (18)| 00:00:01 |
-----------------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   4 - access("CTXSYS"."CONTAINS"("MY_DEMO"."C2" /*+ LOB_BY_VALUE */
              ,'((sdata(FNUM_6AA7829B12A1ECA7231B4272D111B8B4_id  >= 5 )) and
              (sdata(FNUM_6AA7829B12A1ECA7231B4272D111B8B4_id  <= 12 )))')>0)
   5 - filter("P"."X1">=5 AND "P"."X1"<=12)
 
 
Even when queries are written using JSON_TABLE or nested projections, the behavior remains consistent. The index is probed using both the JSON path and the numeric range values, returning only the relevant rowids. This demonstrates that the TEXT_VALUE index effectively supports structured numeric filtering alongside full-text capabilities.
 
Next, consider full-text search scenarios. When running a JSON_TEXTCONTAINS query on a string attribute, the execution plan shows that the search index is used directly.

 
demo@ADB26AI> select *
  2  from my_demo
  3  where json_textcontains( c2, '$.products.str','ljfycrixvcehvmpmldubibaur') ;
 
Execution Plan
----------------------------------------------------------
Plan hash value: 2005405343
 
----------------------------------------------------------------------------------------------
| Id  | Operation                      | Name        | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT               |             |  2500 |  1777K|   488   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID   | MY_DEMO     |  2500 |  1777K|   488   (0)| 00:00:01 |
|   2 |   SORT CLUSTER BY ROWID BATCHED|             |       |       |     4   (0)| 00:00:01 |
|*  3 |    DOMAIN INDEX                | MY_DEMO_IDX |       |       |     4   (0)| 00:00:01 |
----------------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   3 - access("CTXSYS"."CONTAINS"("MY_DEMO"."C2" /*+ LOB_BY_VALUE */
              ,'(ljfycrixvcehvmpmldubibaur) INPATH (/products/str)')>0)
 
In this case, both the path expression and the search string are used to probe the index, and only matching rowids are returned. This confirms that the TEXT_VALUE index fully supports full-text search requirements in addition to numeric filtering.
 
However, an interesting limitation appears when we move into string-based range predicates. Suppose we perform a query that filters string values using a BETWEEN condition on a VARCHAR2 attribute.

 
demo@ADB26AI> select *
  2  from my_demo nested c2.products[*]
  3      columns ( x1 varchar2(40) path '$.str')
  4  where x1 between 'a' and 'b';
 
Execution Plan
----------------------------------------------------------
Plan hash value: 3720428357
 
-----------------------------------------------------------------------------------------------
| Id  | Operation                       | Name        | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                |             |    31M|    21G|  1268K (19)| 00:00:50 |
|   1 |  NESTED LOOPS                   |             |    31M|    21G|  1268K (19)| 00:00:50 |
|   2 |   TABLE ACCESS BY INDEX ROWID   | MY_DEMO     |  2500 |  1777K|   488   (0)| 00:00:01 |
|   3 |    SORT CLUSTER BY ROWID BATCHED|             |       |       |     4   (0)| 00:00:01 |
|*  4 |     DOMAIN INDEX                | MY_DEMO_IDX |       |       |     4   (0)| 00:00:01 |
|*  5 |   JSONTABLE EVALUATION          |             | 12500 | 25000 |   507  (18)| 00:00:01 |
-----------------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   4 - access("CTXSYS"."CONTAINS"("MY_DEMO"."C2" /*+ LOB_BY_VALUE */
              ,'((HASPATH(/products/str)) and (HASPATH(/products/str)))')>0)
   5 - filter("P"."X1">='a' AND "P"."X1"<='b')
 
 
At first glance, the execution plan still shows index usage. But a closer look at the access predicates reveals that the index is navigated only using the path expression, not the actual string range values. As a result, the index returns all rowids that match the path expression, and the range filter on the string values is applied later during table access.

This means that while the index still contributes to the query, it is not as efficient as a scenario where both the path and range values are used directly during index navigation. The filtering on string range values is effectively pushed to a later stage, which can reduce performance benefits for large datasets.
 
This behavior highlights an important nuance of the default TEXT_VALUE configuration. By default, it enables full-text search and range-based search for NUMBER and TIMESTAMP datatypes, but not for VARCHAR2 range predicates. If the workload includes range-based filtering on string values, we must explicitly include VARCHAR2 as a supported datatype during index creation.
 
To enable this, the index can be recreated with explicit datatype inclusion:
 
demo@ADB26AI> drop index if exists my_demo_idx ;
 
Index dropped.
 
demo@ADB26AI> create search index my_demo_idx
  2  on my_demo(c2)
  3  for json parameters(' search_on TEXT_VALUE (NUMBER,VARCHAR2) ');
 
Index created.
 
demo@ADB26AI> select round(sum(bytes)/1024/1024,2) size_mb
  2  from user_segments
  3  where segment_name like '%MY_DEMO_IDX%';
 
   SIZE_MB
----------
   2477.06
 
With this configuration, the index now supports full-text search as well as range-based filtering for both numeric and string datatypes. Naturally, this broader coverage comes with an increase in index size. In this example, the index size grows to roughly 2.4 GB, reflecting the additional datatype indexing.
 
Once this enhanced index is in place, both full-text searches and string-based range predicates benefit from efficient index navigation.

 
demo@ADB26AI> select *
  2  from my_demo
  3  where json_textcontains( c2, '$.products.str','ljfycrixvcehvmpmldubibaur') ;
 
Execution Plan
----------------------------------------------------------
Plan hash value: 2005405343
 
----------------------------------------------------------------------------------------------
| Id  | Operation                      | Name        | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT               |             |  2500 |  1777K|   488   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID   | MY_DEMO     |  2500 |  1777K|   488   (0)| 00:00:01 |
|   2 |   SORT CLUSTER BY ROWID BATCHED|             |       |       |     4   (0)| 00:00:01 |
|*  3 |    DOMAIN INDEX                | MY_DEMO_IDX |       |       |     4   (0)| 00:00:01 |
----------------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   3 - access("CTXSYS"."CONTAINS"("MY_DEMO"."C2" /*+ LOB_BY_VALUE */
              ,'(ljfycrixvcehvmpmldubibaur) INPATH (/products/str)')>0)
 
 
demo@ADB26AI> select *
  2  from my_demo nested c2.products[*]
  3      columns ( x1 varchar2(40) path '$.str')
  4  where x1 between 'a' and 'b';
 
Execution Plan
----------------------------------------------------------
Plan hash value: 3720428357
 
-----------------------------------------------------------------------------------------------
| Id  | Operation                       | Name        | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                |             |    31M|    21G|  1268K (19)| 00:00:50 |
|   1 |  NESTED LOOPS                   |             |    31M|    21G|  1268K (19)| 00:00:50 |
|   2 |   TABLE ACCESS BY INDEX ROWID   | MY_DEMO     |  2500 |  1777K|   488   (0)| 00:00:01 |
|   3 |    SORT CLUSTER BY ROWID BATCHED|             |       |       |     4   (0)| 00:00:01 |
|*  4 |     DOMAIN INDEX                | MY_DEMO_IDX |       |       |     4   (0)| 00:00:01 |
|*  5 |   JSONTABLE EVALUATION          |             | 12500 | 25000 |   507  (18)| 00:00:01 |
-----------------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   4 - access("CTXSYS"."CONTAINS"("MY_DEMO"."C2" /*+ LOB_BY_VALUE */
              ,'((sdata(FVCH_073E8889C10DE4900FDC7FB51B882966_str  >= "a" )) and
              (sdata(FVCH_073E8889C10DE4900FDC7FB51B882966_str  <= "b" )))')>0)
   5 - filter("P"."X1">='a' AND "P"."X1"<='b')
 
 
Execution plans confirm that the optimizer now uses both the path expression and the range predicate values to probe the index directly. Only the matching rowids are returned from the index, significantly improving filtering efficiency compared to the default configuration
 
The key takeaway is that TEXT_VALUE offers a flexible and unified indexing approach for mixed workloads that require both keyword-style searches and structured filtering. However, understanding its default datatype behavior is important. If string-based range predicates are part of the workload, explicitly including VARCHAR2 during index creation ensures that these queries benefit fully from index-level filtering.
 
Choosing the right configuration ultimately depends on query patterns and storage considerations. While broader datatype support increases index size, it can dramatically improve performance for combined search workloads. The TEXT_VALUE parameter therefore provides a powerful option for designing JSON search indexes that balance flexibility, performance, and storage in modern Oracle workloads.