In one of the previous blogs, we discussed how a JSON Search Index
can be configured to support either full-text
search or range-based
search across supported datatypes such as NUMBER, TIMESTAMP, and VARCHAR2.
We also saw that with the TEXT_VALUE
option, we could combine full-text search with range-based predicates for
numeric and timestamp datatypes by default, and optionally include string
datatypes when explicitly specified.
Now let us move into a fourth flavor of JSON search index
configuration — one that is slightly more aggressive than the earlier variants
we explored. This option is designed to support full-text search and
range-based search for all supported datatypes (numeric, string, and timestamp)
by default. The trade-off, however, is increased storage consumption compared
to the other three configurations.
This configuration is enabled using:
demo@ADB26AI> create
search index my_demo_idx
2 on
my_demo(c2)
3 for
json parameters(' search_on TEXT_VALUE_STRING
');
Index created.
After creating
the index, we can check its size:
demo@ADB26AI> select
round(sum(bytes)/1024/1024,2) size_mb
2 from
user_segments
3
where segment_name like '%MY_DEMO_IDX%';
SIZE_MB
----------
2485.06
In this case, the index size is approximately 2.4 GB. This is
noticeably larger than the earlier configurations, but the benefit is clear:
the index is now capable of handling a wide variety of workloads across all
supported datatypes without requiring additional datatype-specific
configuration.
Let us look at how this behaves in practice
When executing a numeric equality predicate using JSON_EXISTS,
the execution plan clearly shows that both the JSON path and the numeric value
are pushed into the search index. The index performs the navigation internally
and returns only the matching rowids. These rowids are then used to access the
table rows. This ensures that filtering happens inside the index rather than at
the table level, improving efficiency.
demo@ADB26AI> select *
2 from
my_demo
3 where
json_exists( c2 ,'$.products[*]?(@.id.number() == 42)');
Execution Plan
----------------------------------------------------------
Plan hash value: 2005405343
----------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows
| Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 25 | 18200 | 488
(0)| 00:00:01 |
|* 1 |
TABLE ACCESS BY INDEX ROWID |
MY_DEMO | 25 | 18200 | 488
(0)| 00:00:01 |
| 2 |
SORT CLUSTER BY ROWID BATCHED| | |
| 4 (0)| 00:00:01 |
|* 3 |
DOMAIN INDEX | MY_DEMO_IDX | |
| 4 (0)| 00:00:01 |
----------------------------------------------------------------------------------------------
Predicate Information
(identified by operation id):
---------------------------------------------------
1 - filter(JSON_EXISTS2("C2" /*+
LOB_BY_VALUE */ FORMAT OSON ,
'$.products[*]?(@.id.number() ==
42)' /* json_path_str
$.products[*]?(@.id.number()
== 42) */
FALSE ON ERROR TYPE(LAX) )=1)
3 - access("CTXSYS"."CONTAINS"("MY_DEMO"."C2"
/*+ LOB_BY_VALUE */
,'(sdata(FNUM_6AA7829B12A1ECA7231B4272D111B8B4_id = 42 ))')>0)
The same behavior applies to open-ended numeric range predicates
such as >= 5. Again, both the path expression and the numeric boundary value
are used during index navigation. The index performs the filtering and returns
only relevant rowids.
demo@ADB26AI> select *
2 from
my_demo
3
where json_exists( c2 ,'$.products[*]?(@.id.number() >= 5)');
Execution Plan
----------------------------------------------------------
Plan hash value: 2005405343
----------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows
| Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | |
25 | 18200 | 488 (0)| 00:00:01 |
|* 1 |
TABLE ACCESS BY INDEX ROWID |
MY_DEMO | 25 | 18200 | 488
(0)| 00:00:01 |
| 2 |
SORT CLUSTER BY ROWID BATCHED| | |
| 4 (0)| 00:00:01 |
|* 3 |
DOMAIN INDEX |
MY_DEMO_IDX | |
| 4 (0)| 00:00:01 |
----------------------------------------------------------------------------------------------
Predicate Information
(identified by operation id):
---------------------------------------------------
1 - filter(JSON_EXISTS2("C2" /*+
LOB_BY_VALUE */ FORMAT OSON ,
'$.products[*]?(@.id.number()
>= 5)' /* json_path_str
$.products[*]?(@.id.number()
>= 5) */
FALSE ON ERROR TYPE(LAX) )=1)
3 - access("CTXSYS"."CONTAINS"("MY_DEMO"."C2"
/*+ LOB_BY_VALUE */
,'(sdata(FNUM_6AA7829B12A1ECA7231B4272D111B8B4_id >= 5 ))')>0)
For two-sided range predicates, such as a BETWEEN 5 AND 12 condition,
the optimizer still pushes both lower and upper bounds into the search index.
The execution plan confirms that the index is probed with both conditions
combined, resulting in efficient rowid retrieval before table access.
demo@ADB26AI> select *
2 from
my_demo nested c2.products[*]
3
columns ( x1 number path '$.id')
4
where x1 between 5 and 12;
Execution Plan
----------------------------------------------------------
Plan hash value: 3720428357
-----------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows
| Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 31M|
21G| 1268K (19)| 00:00:50 |
| 1 |
NESTED LOOPS
| | 31M|
21G| 1268K (19)| 00:00:50 |
| 2 |
TABLE ACCESS BY INDEX ROWID |
MY_DEMO | 2500 |
1777K| 488 (0)| 00:00:01 |
| 3 |
SORT CLUSTER BY ROWID BATCHED| | |
| 4 (0)| 00:00:01 |
|* 4 |
DOMAIN INDEX |
MY_DEMO_IDX | |
| 4 (0)| 00:00:01 |
|* 5 |
JSONTABLE EVALUATION |
| 12500 | 25000 | 507 (18)| 00:00:01 |
-----------------------------------------------------------------------------------------------
Predicate Information
(identified by operation id):
---------------------------------------------------
4 - access("CTXSYS"."CONTAINS"("MY_DEMO"."C2"
/*+ LOB_BY_VALUE */
,'((sdata(FNUM_6AA7829B12A1ECA7231B4272D111B8B4_id >= 5 )) and
(sdata(FNUM_6AA7829B12A1ECA7231B4272D111B8B4_id <= 12 )))')>0)
5 -
filter("P"."X1">=5 AND
"P"."X1"<=12)
What makes TEXT_VALUE_STRING
particularly powerful is that this same level of support extends to
string-based predicates as well.
For full-text search using JSON_TEXTCONTAINS, the execution plan shows that
the path and the search string are both used to probe the index. Only matching
rowids are returned, ensuring efficient full-text filtering.
demo@ADB26AI> select *
2 from
my_demo
3
where json_textcontains( c2, '$.products.str','ljfycrixvcehvmpmldubibaur')
;
Execution Plan
----------------------------------------------------------
Plan hash value: 2005405343
----------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows
| Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | |
2500 | 1777K| 488
(0)| 00:00:01 |
| 1 |
TABLE ACCESS BY INDEX ROWID |
MY_DEMO | 2500 |
1777K| 488 (0)| 00:00:01 |
| 2 |
SORT CLUSTER BY ROWID BATCHED| | |
| 4 (0)| 00:00:01 |
|* 3 |
DOMAIN INDEX |
MY_DEMO_IDX | |
| 4 (0)| 00:00:01 |
----------------------------------------------------------------------------------------------
Predicate Information
(identified by operation id):
---------------------------------------------------
3 - access("CTXSYS"."CONTAINS"("MY_DEMO"."C2"
/*+ LOB_BY_VALUE */
,'(ljfycrixvcehvmpmldubibaur)
INPATH (/products/str)')>0)
More importantly, range-based string predicates are also fully
supported. When performing a BETWEEN 'a' AND 'b' condition on a VARCHAR2 attribute
extracted through JSON_TABLE, the execution plan confirms that both the path
and the string range boundaries are pushed into the index. Unlike earlier
configurations where only the path was used and value filtering happened later,
here the index itself evaluates the range predicate.
demo@ADB26AI> select *
2 from
my_demo nested c2.products[*]
3
columns ( x1 varchar2(40) path '$.str')
4
where x1 between 'a' and 'b';
Execution Plan
----------------------------------------------------------
Plan hash value: 3720428357
-----------------------------------------------------------------------------------------------
| Id | Operation | Name |
Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 31M|
21G| 1268K (19)| 00:00:50 |
| 1 |
NESTED LOOPS
| | 31M|
21G| 1268K (19)| 00:00:50 |
| 2 |
TABLE ACCESS BY INDEX ROWID |
MY_DEMO | 2500 |
1777K| 488 (0)| 00:00:01 |
| 3 |
SORT CLUSTER BY ROWID BATCHED| | |
| 4 (0)| 00:00:01 |
|* 4 |
DOMAIN INDEX |
MY_DEMO_IDX | |
| 4 (0)| 00:00:01 |
|* 5 |
JSONTABLE EVALUATION
| | 12500 | 25000 | 507
(18)| 00:00:01 |
-----------------------------------------------------------------------------------------------
Predicate Information
(identified by operation id):
---------------------------------------------------
4 -
access("CTXSYS"."CONTAINS"("MY_DEMO"."C2"
/*+ LOB_BY_VALUE */
,'((sdata(FVCH_073E8889C10DE4900FDC7FB51B882966_str >= "a" )) and
(sdata(FVCH_073E8889C10DE4900FDC7FB51B882966_str <= "b" )))')>0)
5 -
filter("P"."X1">='a' AND
"P"."X1"<='b')
This is the key differentiator. With TEXT_VALUE_STRING,
both full-text search and structured range predicates — for NUMBER, TIMESTAMP,
and VARCHAR2 — are fully handled at the index level. The optimizer can combine
path expressions and predicate values into the index probe, returning only the
necessary rowids.
Of course, this power comes at a cost. The index size increases
significantly because it stores additional structured information for string
datatypes to enable range evaluation. However, for mixed workloads that heavily
rely on both textual and structured filtering across all datatypes, the storage
overhead may be a worthwhile trade-off.
In summary, TEXT_VALUE_STRING
represents the most comprehensive JSON search index configuration among the
variants discussed so far. It provides maximum flexibility and performance
coverage across all supported datatypes, making it ideal for complex JSON
workloads where query patterns are diverse and unpredictable.