One of the many nice things about Oracle Database 21c is the JSON datatype, an optimized native binary storage format based on OSON. All JSON documents in Autonomous Database are automatically stored in OSON format. This delivers significant performance benefits to JSON applications, including faster queries, efficient updates, multi-value function indexes, and reduced storage size. The native format is completely transparent: applications continue to work with standard JSON documents, while all database operations on those documents are optimized under the covers via the binary format.
OSON is based on a tree encoding: a JSON document is serialized into an OSON byte array containing tree pointers (navigation offsets), so the server can jump directly to a field without parsing the whole document.
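To make the idea of "tree pointers for navigation offsets" concrete, here is a toy sketch in Python. This is only an illustration of offset-based field access, not the real OSON layout; the header format and field names below are made up for the example.

```python
import json
import struct

# Illustration only (not the actual OSON encoding): serialize a document as
# [4-byte header length][header: key -> value offset][value bytes], so a
# reader can jump straight to one field without parsing the whole document.
def encode(doc):
    values = [json.dumps(v).encode() for v in doc.values()]
    offsets, pos = [], 0
    for v in values:
        offsets.append(pos)
        pos += len(v)
    header = json.dumps({k: o for k, o in zip(doc, offsets)}).encode()
    return struct.pack(">I", len(header)) + header + b"".join(values)

def get_field(buf, key):
    hlen = struct.unpack(">I", buf[:4])[0]
    header = json.loads(buf[4:4 + hlen])
    start = 4 + hlen + header[key]
    # the field ends where the next offset begins (or at end of buffer)
    later = sorted(o for o in header.values() if o > header[key])
    end = 4 + hlen + later[0] if later else len(buf)
    return json.loads(buf[start:end])

buf = encode({"owner": "DEMO", "object_id": 42})
print(get_field(buf, "object_id"))  # jumps via offset, no full parse
```

The point is that the offset table lets `get_field` slice out exactly one value's bytes, which is the kind of navigation the binary format enables.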
Let's compare text-based storage of JSON documents with the JSON datatype.
demo@ATP21C> create table test_json(x number, y json);

Table created.

demo@ATP21C> create table test_text(x number, y clob check(y is json format json));

Table created.
The table "big_table" has been loaded with some 100K rows (a copy of all_objects), and the average JSON document is about 620 bytes in size.
demo@ATP21C> select round(avg(length(json_object(*))),2) size_bytes from big_table;

SIZE_BYTES
----------
    622.89
To make it bigger, let's aggregate those rows into a single large JSON document using SQL/JSON functions; it grows to about 59.5 MB.
demo@ATP21C> select round(length(json_arrayagg(json_object(*) returning blob)),2) size_bytes from big_table;

SIZE_BYTES
----------
  62388939
Now let's insert that large JSON document into each of our test tables:

demo@ATP21C> insert into test_json(x,y)
  2  select 1, json_arrayagg(json_object(*) returning blob)
  3  from big_table;

1 row created.

demo@ATP21C> insert into test_text(x,y)
  2  select x, json_serialize(y returning clob) from test_json;

1 row created.
The size of the large JSON document in each table is:
demo@ATP21C> select round(length(y)/1024/1024,2) size_mb from test_text;

SIZE_MB
----------
      59.5

demo@ATP21C> select round(length(y)/1024/1024,2) size_mb from test_json;

SIZE_MB
----------
     27.63
The JSON datatype takes roughly half the storage of text-based JSON here. The reason: this document is an array of many similar JSON objects, so each "key" is repeated in every array element. OSON replaces those repeated keys with small IDs backed by a single key dictionary, resulting in an efficient reduction in space.
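The key-dictionary effect can be sketched in a few lines of Python. This is not the real OSON encoding, just an illustration of the idea: repeated keys become small integer IDs, and the full key names are stored once.

```python
import json

# A thousand similar documents, like the array aggregated above -- every
# document repeats the same three keys. (Field names are made up here.)
docs = [{"owner": "DEMO", "object_name": f"OBJ_{i}", "object_type": "TABLE"}
        for i in range(1000)]

plain = json.dumps(docs)  # text storage: keys repeated in every element

# Illustration of dictionary-encoding: map each key to a small integer ID,
# store the full key names once at the end.
key_ids = {}
def encode(doc):
    return {key_ids.setdefault(k, len(key_ids)): v for k, v in doc.items()}

encoded = json.dumps([encode(d) for d in docs]) + json.dumps(list(key_ids))

print(len(plain), len(encoded))  # the ID-encoded form is noticeably smaller
```

With many repeated keys, the one-time dictionary cost is dwarfed by the savings from replacing every repeated key with a short ID, which is the same effect we saw in the 59.5 MB vs 27.63 MB comparison.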
Now let's see the effect of the JSON datatype on many small documents instead of one large document. Let's reload the two tables, storing each row as a separate JSON document.
demo@ATP21C> truncate table test_json;

Table truncated.

demo@ATP21C> truncate table test_text;

Table truncated.

demo@ATP21C> insert all
  2    into test_json(x,y) values(r,j)
  3    into test_text(x,y) values(r,j)
  4  select rownum r, json_object(*) j
  5  from big_table;

200000 rows created.
demo@ATP21C> select round(sum(length(y))/1024/1024,2) size_mb from test_text;

SIZE_MB
----------
      59.4

demo@ATP21C> select round(sum(length(y))/1024/1024,2) size_mb from test_json;

SIZE_MB
----------
     64.63
Now the JSON datatype is a little bigger than the text storage, because each document persisted in the datatype is self-contained: within a single small document there are no repeated "keys", so the dictionary encoding has nothing to compress and instead adds a small per-document overhead.
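Continuing the earlier illustration (again, not the real OSON encoding), the small-document case looks like this: when each document carries its own key table, the IDs plus the dictionary can cost slightly more than the plain keys they replace.

```python
import json

# One small, self-contained document -- each key appears only once,
# so there is nothing for dictionary encoding to deduplicate.
# (Field names are made up for the example.)
doc = {"owner": "DEMO", "object_name": "EMP", "object_type": "TABLE"}

plain = json.dumps(doc)

# Per-document key table: the body uses short IDs, but the full key
# names still have to be stored once in the document's own dictionary.
key_ids = {k: i for i, k in enumerate(doc)}
encoded = (json.dumps({key_ids[k]: v for k, v in doc.items()})
           + json.dumps(list(key_ids)))

print(len(plain), len(encoded))  # encoded is larger: keys appear once either way
```

This mirrors the 59.4 MB vs 64.63 MB result above: with no repeated keys inside a document, the encoding overhead outweighs the savings.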