Thursday, July 20, 2023

JSON-Schema - Part I

JSON Schema is a declarative language that allows to annotate and validate JSON documents, it enables a confident and reliable use of the JSON data format. Benefits of JSON schema includes describes your existing data format, provides clear human and machine readable documentation & validates the data which is useful for automated testing & ensuring the quality of client submitted data. It is similar to XML schema (aka DTD)
 
JSON Schema support in Oracle database is not something new in Oracle 23c instead is it available in Oracle 19c as JSON Dataguide (introduced in 12.2.0.1 and got enhanced greatly in subsequent release) to support automatic creation of relational views. In Oracle 23c this JSON Schema is enhanced further to
 
  • validation and type casting
  • describe database object using JSON schema
  • It became base for PRECHECK constraints.
 
Let’s get started with a demo of using JSON schema into action.
 
Got a table “orders” loaded with 1000 JSON documents in it
 
demo@FREEPDB1> select count(*) from orders ;
 
  COUNT(*)
----------
     10000
 
demo@FREEPDB1> desc orders
 Name               Null?    Type
 ------------------ -------- ----------
 ORDER_ID                    NUMBER
 DATA                        JSON
 
Now we generate JSON Schema for all these rows, using json_dataguide function , that scan all these documents and provide a JSON Schema for the instance data we see.
 
demo@FREEPDB1> select json_dataguide( data, dbms_json.FORMAT_HIERARCHICAL,dbms_json.pretty ) from orders;
 
JSON_DATAGUIDE(DATA,DBMS_JSON.FORMAT_HIERARCHICAL,DBMS_JSON.PRETTY)
--------------------------------------------------------------------------------
{
  "type" : "object",
  "o:length" : 1,
  "properties" :
  {
    "User" :
    {
      "type" : "string",
      "o:length" : 8,
      "o:preferred_column_name" : "User"
    },
    "PONumber" :
    {
      "type" : "number",
      "o:length" : 4,
      "o:preferred_column_name" : "PONumber"
    },
    "LineItems" :
    {
      "type" : "array",
      "o:length" : 1,
      "o:preferred_column_name" : "LineItems",
      "items" :
      {
        "properties" :
        {
          "Part" :
          {
            "type" : "object",
            "o:length" : 1,
            "o:preferred_column_name" : "Part",
            "properties" :
            {
              "UPCCode" :
              {
                "type" : "number",
                "o:length" : 8,
                "o:preferred_column_name" : "UPCCode"
              },
              "UnitPrice" :
              {
                "type" : "number",
                "o:length" : 4,
                "o:preferred_column_name" : "UnitPrice"
              },
              "Description" :
              {
                "type" : "string",
                "o:length" : 32,
                "o:preferred_column_name" : "Description"
              }
            }
          },
          "Quantity" :
          {
            "type" : "number",
            "o:length" : 2,
            "o:preferred_column_name" : "Quantity"
          },
          "ItemNumber" :
          {
            "type" : "number",
            "o:length" : 2,
            "o:preferred_column_name" : "ItemNumber"
          }
        }
      }
    },
    "Reference" :
    {
      "type" : "string",
      "o:length" : 16,
      "o:preferred_column_name" : "Reference"
    },
    "Requestor" :
    {
      "type" : "string",
      "o:length" : 16,
      "o:preferred_column_name" : "Requestor"
    },
    "CostCenter" :
    {
      "type" : "string",
      "o:length" : 4,
      "o:preferred_column_name" : "CostCenter"
    },
    "AllowPartialShipment" :
    {
      "type" : "boolean",
      "o:length" : 8,
      "o:preferred_column_name" : "AllowPartialShipment"
    },
    "ShippingInstructions" :
    {
      "type" : "object",
      "o:length" : 1,
      "o:preferred_column_name" : "ShippingInstructions",
      "properties" :
      {
        "name" :
        {
          "type" : "string",
          "o:length" : 16,
          "o:preferred_column_name" : "name"
        },
        "Phone" :
        {
          "type" : "array",
          "o:length" : 1,
          "o:preferred_column_name" : "Phone",
          "items" :
          {
            "properties" :
            {
              "type" :
              {
                "type" : "string",
                "o:length" : 8,
                "o:preferred_column_name" : "type"
              },
              "number" :
              {
                "type" : "string",
                "o:length" : 16,
                "o:preferred_column_name" : "number"
              }
            }
          }
        },
        "Address" :
        {
          "type" : "object",
          "o:length" : 1,
          "o:preferred_column_name" : "Address",
          "properties" :
          {
            "city" :
            {
              "type" : "string",
              "o:length" : 32,
              "o:preferred_column_name" : "city"
            },
            "state" :
            {
              "type" : "string",
              "o:length" : 2,
              "o:preferred_column_name" : "state"
            },
            "street" :
            {
              "type" : "string",
              "o:length" : 32,
              "o:preferred_column_name" : "street"
            },
            "country" :
            {
              "type" : "string",
              "o:length" : 32,
              "o:preferred_column_name" : "country"
            },
            "zipCode" :
            {
              "type" : "number",
              "o:length" : 4,
              "o:preferred_column_name" : "zipCode"
            }
          }
        }
      }
    },
    "Special Instructions" :
    {
      "type" : "null",
      "o:length" : 1,
      "o:preferred_column_name" : "Special Instructions"
    }
  }
}
 
 
Lets look at the JSON schema in high level
 
{
  "type" : "object",
  "o:length" : 1,
  "properties" :
  {
    "User" :
    {
      "type" : "string",
      "o:length" : 8,
      "o:preferred_column_name" : "User"
    },
    "PONumber" :
    {
      "type" : "number",
      "o:length" : 4,
      "o:preferred_column_name" : "PONumber"
    },
 
The "type" : "object", tell us that the data we look here is an JSON object, which has the attributes like "User" and "PONumber" that are of the datatype varchar2 and number with “Preferred column name” as “User” and “PONumber” and these “Preferred column name” will be used as column name when these JSON data is exposed as relational views.
 
In short  the JSON schema produced is all about listing the fields in the documents along with its datatype.
 
In Oracle 23c we can add this JSON schema as a declarative constraint to the data model.
 
demo@FREEPDB1> alter table orders
  2  add constraint orders_json_chk
  3  check( data is json validate using q'# {
  4    "type" : "object",
  5    "o:length" : 1,
  6    "properties" :
  7    {
  8      "User" :
  9      {
 10        "type" : "string",
 11        "o:length" : 8,
 12        "o:preferred_column_name" : "User"
 13      }
……………
……………
……………
173      "Special Instructions" :
174      {
175        "type" : "null",
176        "o:length" : 1,
177        "o:preferred_column_name" : "Special Instructions"
178      }
179    }
180  } #' ) ;
 
Table altered.
 
 
Now, lets see how this affect the validation during data insertion.
 
demo@FREEPDB1> insert into orders( order_id,data) values( 10123,json('{"hello":"world"}') );
 
1 row created.
 
we expect this insert statement to fail post the JSON schema added to the table because the JSON document persisted does not map to the Keys in the JSON schema. but that was not the case here, it will succeed. the reason is JSON is going to be schema flexible, and got no constraint about the "Keys" it has. and since JSON Schema says the top-level of document to persist is "object" and that matches to this input, hence it allows this document to persist.
 
demo@FREEPDB1> insert into orders( order_id,data) values( 10123,json[1,2,3] );
insert into orders( order_id,data) values( 10123,json[1,2,3] )
*
ERROR at line 1:
ORA-40875: JSON schema validation error
 
this insert got failed, since JSON Schema says the top-level of document to persist is "object" not array, so that confirms that some validation is happening upon json document using the constrained JSON schema.
 
demo@FREEPDB1> insert into orders( order_id,data) values( 10123, json{'Special Instructions':true} );
insert into orders( order_id,data) values( 10123, json{'Special Instructions':true} )
            *
ERROR at line 1:
ORA-40875: JSON schema validation error
 
this got failed, since "Special Instructions" can have either string or null (as per the constrained JSON schema) but not a Boolean value
 
To constraint the “KEYS” in JSON document, we need to redefine the Schema as
 
demo@FREEPDB1> alter table orders
  2  drop constraint orders_json_chk;
 
Table altered.
 
demo@FREEPDB1> alter table orders
  2  add constraint orders_json_chk
  3  check( data is json validate using q'# {
  4    "type" : "object",
  5    "o:length" : 1,
  6    "properties" :
  7    {
  8      "User" :
  9      {
 10        "type" : "string",
 11        "o:length" : 8,
 12        "o:preferred_column_name" : "User"
 13      },
 14      "PONumber" :
 15      {
 16        "type" : "number",
 17        "o:length" : 4,
 18        "o:preferred_column_name" : "PONumber"
 19      },
………………………………………..
………………………………………..
………………………………………..
173      "Special Instructions" :
174      {
175        "type" : "null",
176        "o:length" : 1,
177        "o:preferred_column_name" : "Special Instructions"
178      }
179    } , "additionalProperties":false
180  } #' ) ;
 
Table altered.
 
The additionalProperties keyword is used to control the handling of extra stuff, that is, properties whose names are not listed in the “properties” keyword, The value of the additionalProperties keyword is a schema that will be used to validate any “properties” in the instance that are not matched by properties. Setting the additionalProperties schema to false means no additional properties will be allowed
 
Once the constraint got recreated with “additionalProperties” of JSON schema set to false, and now repeating the first Insert statement fail like this
 
demo@FREEPDB1> insert into orders( order_id,data) values( 10123,json('{"hello":"world"}') );
insert into orders( order_id,data) values( 10123,json('{"hello":"world"}') )
            *
ERROR at line 1:
ORA-40875: JSON schema validation error
 
 
The reason behind this constraint violation error is, the JSON document got the key “hello” but that is not present in the JSON schema document mapped to the constraint,

No comments:

Post a Comment