Wednesday, January 11, 2017

Loading JSON from external file system

Very recently got a request from an internal team to load a JSON document (from external file system) into Oracle database.

Thought of using DBMS_LOB API to read the file contents and load them into application tables, but it got failed miserably.

demo@ORA12C> create table t(doc clob
  2     constraint t_valid_json
  3     check(doc IS JSON) );

Table created.

demo@ORA12C> declare
  2     l_bfile bfile;
  3     l_temp clob;
  4  begin
  5     insert into t(doc)
  6             values( empty_clob() )
  7             returning doc into l_temp;
  8
  9     l_bfile := bfilename('ETL_DATA_DIR','sample_json.txt');
 10
 11     dbms_lob.fileopen(l_bfile);
 12     dbms_lob.loadfromfile(
 13             dest_lob=>l_temp,
 14             src_lob=>l_bfile,
 15             amount=>dbms_lob.getlength(l_bfile)) ;
 16     dbms_lob.fileclose(l_bfile);
 17     commit;
 18
 19     exception
 20             when others then
 21             if dbms_lob.isopen(l_bfile) = 1 then
 22                     dbms_lob.fileclose(l_bfile);
 23             end if;
 24             raise;
 25  end;
 26  /
declare
*
ERROR at line 1:
ORA-06502: PL/SQL: numeric or value error: invalid LOB locator specified: ORA-22275
ORA-06512: at "SYS.DBMS_LOB", line 897
ORA-06512: at line 21
ORA-02290: check constraint (DEMO.T_VALID_JSON) violated

This error message is not due to invalid JSON document (infact the contents inside the file "sample_json.txt" is a valid JSON document, validated it through http://jsonlint.com/ portal.)

This error message is due to inserting empty_clob() into the table (in the above block).

demo@ORA12C> insert into t(doc) values( empty_clob() );
insert into t(doc) values( empty_clob() )
*
ERROR at line 1:
ORA-02290: check constraint (DEMO.T_VALID_JSON) violated

So, how to over come this, we could see if external table could help us. so created a sql loader control file and asked sql-loader to generate external table for us.

demo@ORA12C> host
Microsoft Windows [Version 6.1.7601]
Copyright (c) 2009 Microsoft Corporation.  All rights reserved.

C:\Users\179818>type d:\ctl.txt
load data
infile *
truncate into table t
(
lob_filenm filler char(100),
doc lobfile(lob_filenm) TERMINATED BY EOF)
begindata
d:\sample_json.txt

C:\Users\179818>sqlldr demo/demo@ora12c control=d:\ctl.txt log=d:\log.txt EXTERNAL_TABLE = GENERATE_ONLY

SQL*Loader: Release 12.1.0.2.0 - Production on Wed Jan 11 12:42:51 2017

Copyright (c) 1982, 2014, Oracle and/or its affiliates.  All rights reserved.

Path used:      External Table
SQL*Loader-803: Field type LOBFILE not supported by External Table

C:\Users\179818>exit

demo@ORA12C>

But then it doesn't help us, so created an external table manually and loaded those JSON document like this.

demo@ORA12C> CREATE TABLE t_load_json (
  2    doc CLOB
  3  )
  4  ORGANIZATION EXTERNAL
  5  (
  6    TYPE ORACLE_LOADER
  7    DEFAULT DIRECTORY ETL_DATA_DIR
  8    ACCESS PARAMETERS
  9    (
 10      RECORDS DELIMITED BY NEWLINE
 11     FIELDS TERMINATED BY ','
 12      MISSING FIELD VALUES ARE NULL
 13      (
 14        clob_filename     CHAR(100)
 15      )
 16      COLUMN TRANSFORMS (doc FROM LOBFILE (clob_filename) FROM (ETL_DATA_DIR) CLOB)
 17    )
 18    LOCATION ('sample_data.txt')
 19  )
 20  REJECT LIMIT UNLIMITED ;

Table created.

demo@ORA12C> select dbms_lob.getlength(doc) from t_load_json;

DBMS_LOB.GETLENGTH(DOC)
-----------------------
                  11877

1 row selected.

demo@ORA12C> insert into t(doc)
  2  select doc
  3  from t_load_json;

1 row created.

demo@ORA12C> commit;

Commit complete.

demo@ORA12C> select dbms_lob.getlength(doc) from t where doc IS JSON;

DBMS_LOB.GETLENGTH(DOC)
-----------------------
                  11877

1 row selected.

demo@ORA12C>

You might still wonder, why don't we load this JSON from sql-loader itself, sql-loader has a couple of bugs in 12c (12.1.0.2) for loading JSON, here is a quick test case for that.

A simple JSON document having one key value pair got loaded successfully, with an insert statement

demo@ORA12C> insert into t(doc) values(' {"department":"Sales"} ');

1 row created.

demo@ORA12C> rollback;

Rollback complete.

But when loading the same JSON document from sql-loader it goes like this, Here is my control file, for this load process.

C:\Users\179818>type d:\ctl.txt
load data
infile *
truncate into table t
( doc )
begindata
{"department":"Sales"}

When initiated the load process, it completed with zero rows processed.

C:\Users\179818>sqlldr demo/demo@ora12c control=d:\ctl.txt log=d:\log.txt

SQL*Loader: Release 12.1.0.2.0 - Production on Wed Jan 11 14:15:35 2017

Copyright (c) 1982, 2014, Oracle and/or its affiliates.  All rights reserved.

Path used:      Conventional
Commit point reached - logical record count 1

Table T:
  0 Rows successfully loaded.

Check the log file:
  d:\log.txt
for more information about the load.

Peeking into the log file , it shows this.

C:\Users\179818>type d:\log.txt

SQL*Loader: Release 12.1.0.2.0 - Production on Wed Jan 11 14:15:35 2017

Copyright (c) 1982, 2014, Oracle and/or its affiliates.  All rights reserved.

Control File:   d:\ctl.txt
Data File:      d:\ctl.txt
  Bad File:     d:\ctl.bad
  Discard File:  none specified

 (Allow all discards)

Number to load: ALL
Number to skip: 0
Errors allowed: 50
Bind array:     64 rows, maximum of 256000 bytes
Continuation:    none specified
Path used:      Conventional

Table T, loaded from every logical record.
Insert option in effect for this table: TRUNCATE

   Column Name                  Position   Len  Term Encl Datatype
------------------------------ ---------- ----- ---- ---- ---------------------
DOC                                 FIRST     1           CHARACTER

Record 1: Rejected - Error on table T.
ORA-02290: check constraint (DEMO.T_VALID_JSON) violated


Table T:
  0 Rows successfully loaded.
  1 Row not loaded due to data errors.
  0 Rows not loaded because all WHEN clauses were failed.
  0 Rows not loaded because all fields were null.


Space allocated for bind array:                    256 bytes(64 rows)
Read   buffer bytes: 1048576

Total logical records skipped:          0
Total logical records read:             1
Total logical records rejected:         1
Total logical records discarded:        0

Run began on Wed Jan 11 14:15:35 2017
Run ended on Wed Jan 11 14:15:39 2017

Elapsed time was:     00:00:03.76
CPU time was:         00:00:00.31

So the JSON document that loaded via INSERT statement but not through sql-loader, is due to the presence of the bug in 12c (12.1.0.2) database ( i hope patches will be available at Oracle support for this. - right now don't have access to Oralce 12c Release 2 database to verify this)

1 comment:

  1. Note To Self-

    Sorry to miss the contents of file "sample_data.txt" used in the above external table location.

    C:\Users\179818>type d:\sample_data.txt
    sample_json.txt
    C:\Users\179818>

    ReplyDelete