Tuesday, May 14, 2024

Oracle ADB Pre-Authenticated URL

Providing temporary access to Oracle database information is challenging. Traditional access control mechanisms that grant access to certain data require continuous management and updates, and enforcing a limited duration or a limited access count for certain tables or SQL query result sets was neither efficient nor straightforward, which made the approach error prone and left sensitive data at risk of exposure.
 
Another big challenge is security: how do you share critical data with your partners and customers safely?
 
To solve these problems, Oracle has a great tool called ORDS (Oracle REST Data Services), which lets you talk to the database through a REST interface. It acts as a middleman that helps you query data, create tables and change data sets, all through REST clients. While ORDS is the default option in the vast majority of use cases, there are some where it can't be used:
 
  • Expiration dates: the data provider wants to set an expiration time on the URL.
  • Use count limits: the data provider needs a cap on how many times a recipient can use the link to access the data.
  • Endpoint transparency: the data provider wants to hide the database name in the URL and expose an abstract endpoint instead.
 
All of the above use cases can now be handled with a new feature: pre-authenticated database URLs. This feature is not a replacement for ORDS; rather, it is a complementary technology that helps solve these specific problems.
 
To generate a pre-authenticated URL, the user needs to run a small PL/SQL block with the desired parameters.
 
demo@ATP19C> ---
demo@ATP19C> --- Generate PAR url for table/view
demo@ATP19C> --- and set the expiration time to 120 minutes
demo@ATP19C> ---
demo@ATP19C>
demo@ATP19C> declare
  2     l_output clob;
  3  begin
  4     dbms_data_access.get_preauthenticated_url(
  5             schema_name => user,
  6             schema_object_name => 'EMP',
  7             expiration_minutes => 120 ,
  8             result => l_output );
  9
 10     dbms_output.put_line( l_output );
 11  end;
 12  /
{"status":"SUCCESS","id":"lpv6c1-1l6jJAVG8XCplqKGPaE7sdZ_t90-h58Y64UekWo8YLqCBHPCf2CJfsom1","preauth_url":"https://dataaccess.adb.us-ashburn-1.oraclecloudapps.com/adb/p/RBleuB3MqUYHp7uAEtpMZ613mIYQ7cEw0BZa6rD3YOM9UMfCj5DX3qUNqzFOEklFbUQaSpBgvCM/data","expiration_ts":"2024-05-07T08:18:28.801Z","expir
ation_count":null}
 
PL/SQL procedure successfully completed.
 
 
The call returns a JSON document that contains the URL to the dataset, like this:
 
{
  "status" : "SUCCESS",
  "id" : "lpv6c1-1l6jJAVG8XCplqKGPaE7sdZ_t90-h58Y64UekWo8YLqCBHPCf2CJfsom1",
  "preauth_url" : "https://dataaccess.adb.us-ashburn-1.oraclecloudapps.com/adb/p/RBleuB3MqUYHp7uAEtpMZ613mIYQ7cEw0BZa6rD3YOM9UMfCj5DX3qUNqzFOEklFbUQaSpBgvCM/data",
  "expiration_ts" : "2024-05-07T08:18:28.801Z",
  "expiration_count" : null
}
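 
Since the result is plain JSON, the URL can also be captured programmatically rather than copied by hand. Here is a minimal sketch (assuming the same EMP table and parameters as above) that extracts the endpoint with the standard JSON_VALUE function:
 
declare
   l_output clob;
   l_url    varchar2(4000);
begin
   dbms_data_access.get_preauthenticated_url(
           schema_name        => user,
           schema_object_name => 'EMP',
           expiration_minutes => 120,
           result             => l_output );

   -- pull just the generated endpoint out of the JSON result
   l_url := json_value( l_output, '$.preauth_url' );
   dbms_output.put_line( l_url );
end;
/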
 
To read the data, simply paste the URL from the JSON above into a browser, or fetch it with curl or any other REST client.
 
demo@ATP19C> $ curl -i --location https://dataaccess.adb.us-ashburn-1.oraclecloudapps.com/adb/p/RBleuB3MqUYHp7uAEtpMZ613mIYQ7cEw0BZa6rD3YOM9UMfCj5DX3qUNqzFOEklFbUQaSpBgvCM/data
HTTP/1.1 200 OK
Date: Tue, 07 May 2024 06:19:04 GMT
Content-Type: application/json
Content-Length: 1862
Connection: keep-alive
Strict-Transport-Security: max-age=31536000;includeSubDomains
X-Content-Type-Options: nosniff
Request-Id: defaultaBX5iDhXOqlRPkRgvwYgoN0w4g34kBUB2bQiireG5B3WawsQyE3XKo53
X-Request-ID: 8daa595c33ae6586affd89d329eaf229
 
{"items":[{"EMPNO":7839,"ENAME":"KING","JOB":"PRESIDENT","HIREDATE":"1981-11-17T00:00:00","SAL":5001,"DEPTNO":10},{"EMPNO":7698,"ENAME":"BLAKE","JOB":"MANAGER","MGR":7839,"HIREDATE":"1981-05-01T00:00:00","SAL":2850,"DEPTNO":30},{"EMPNO":7782,"ENAME":"CLARK","JOB":"MANAGER","MGR":7839,"HIREDATE":"1981-06-09T00:00:00","SAL":2451,"DEPTNO":10}, …………………….
, …………………….
, …………………….
{"EMPNO":7876,"ENAME":"ADAMS","JOB":"CLERK","MGR":7788,"HIREDATE":"1987-05-23T00:00:00","SAL":1100,"DEPTNO":20},{"EMPNO":7900,"ENAME":"JAMES","JOB":"CLERK","MGR":7698,"HIREDATE":"1981-12-03T00:00:00","SAL":950,"DEPTNO":30},{"EMPNO":7934,"ENAME":"MILLER","JOB":"CLERK","MGR":7782,"HIREDATE":"1982-01-23T00:00:00","SAL":1301,"DEPTNO":10}],"hasMore":false,"limit":100,"offset":0,"count":14,"links":[{"rel":"self","href":"https://dataaccess.adb.us-ashburn-1.oraclecloudapps.com/adb/p/RBleuB3MqUYHp7uAEtpMZ613mIYQ7cEw0BZa6rD3YOM9UMfCj5DX3qUNqzFOEklFbUQaSpBgvCM/data"}]}
demo@ATP19C>
 
Similarly, it is possible to set an expiration count while generating pre-authenticated request URLs.
 
demo@ATP19C> ---
demo@ATP19C> --- Generate PAR url for table/view
demo@ATP19C> --- and set the expiration count to 'n' times
demo@ATP19C> ---
demo@ATP19C>
demo@ATP19C> declare
  2     l_output clob;
  3  begin
  4     dbms_data_access.get_preauthenticated_url(
  5             schema_name => user,
  6             schema_object_name => 'EMP',
  7             expiration_count => 2 ,
  8             result => l_output );
  9
 10     dbms_output.put_line( l_output );
 11  end;
 12  /
{"status":"SUCCESS","id":"Reatqul0fDB6u2EIsAvl8J0On7W7ul5SrYa0-uc5pv4EAHZzz2n6CFXqyBnE-G7v","preauth_url":"https://dataaccess.adb.us-ashburn-1.oraclecloudapps.com/adb/p/V9bzugzkD523o7s0ToXQg8SykHsVamFiem60kslurzi85av65VpzvyLGoLTYFMUabUQaSpBgvCM/data","expiration_ts":"2024-08-05T06:22:41.935Z","expir
ation_count":2}
 
PL/SQL procedure successfully completed.
 
demo@ATP19C>
demo@ATP19C>
demo@ATP19C> ---
demo@ATP19C> --- First two accesses to the endpoint should work, the third access should
demo@ATP19C> --- fail
demo@ATP19C> ---
demo@ATP19C>
demo@ATP19C>
demo@ATP19C> $ curl -i --location https://dataaccess.adb.us-ashburn-1.oraclecloudapps.com/adb/p/V9bzugzkD523o7s0ToXQg8SykHsVamFiem60kslurzi85av65VpzvyLGoLTYFMUabUQaSpBgvCM/data
HTTP/1.1 200 OK
Date: Tue, 07 May 2024 06:25:28 GMT
Content-Type: application/json
Content-Length: 1862
Connection: keep-alive
Strict-Transport-Security: max-age=31536000;includeSubDomains
X-Content-Type-Options: nosniff
Request-Id: default9GNiJhw8bAytLX96I8WEvU5X4fUpcdnvri2eRQtK11os5STuDvdL27ge
X-Request-ID: dc1bdb1fb6d575bb11ac71af7a1b2418
 
{"items":[{"EMPNO":7839,"ENAME":"KING","JOB":"PRESIDENT","HIREDATE":"1981-11-17T00:00:00","SAL":5001,"DEPTNO":10},{"EMPNO":7698,"ENAME":"BLAKE","JOB":"MANAGER","MGR":7839,"HIREDATE":"1981-05-01T00:00:00","SAL":2850,"DEPTNO":30
…………………….
, …………………….
, …………………….
{"EMPNO":7900,"ENAME":"JAMES","JOB":"CLERK","MGR":7698,"HIREDATE":"1981-12-03T00:00:00","SAL":950,"DEPTNO":30},{"EMPNO":7934,"ENAME":"MILLER","JOB":"CLERK","MGR":7782,"HIREDATE":"1982-01-23T00:00:00","SAL":1301,"DEPTNO":10}],"hasMore":false,"limit":100,"offset":0,"count":14,"links":[{"rel":"self","href":"https://dataaccess.adb.us-ashburn-1.oraclecloudapps.com/adb/p/V9bzugzkD523o7s0ToXQg8SykHsVamFiem60kslurzi85av65VpzvyLGoLTYFMUabUQaSpBgvCM/data"}]}
demo@ATP19C>
demo@ATP19C>
demo@ATP19C>
demo@ATP19C> $ curl -i --location https://dataaccess.adb.us-ashburn-1.oraclecloudapps.com/adb/p/V9bzugzkD523o7s0ToXQg8SykHsVamFiem60kslurzi85av65VpzvyLGoLTYFMUabUQaSpBgvCM/data
HTTP/1.1 200 OK
Date: Tue, 07 May 2024 06:25:33 GMT
Content-Type: application/json
Content-Length: 1862
Connection: keep-alive
Strict-Transport-Security: max-age=31536000;includeSubDomains
X-Content-Type-Options: nosniff
Request-Id: defaultnTJ02ZbGdwlbDr4sC59QltnXyEe9pp7fki7fykwpJglUdrlM4akuraoS
X-Request-ID: db8366f1f1017cd50ef2a0f173295a8c
 
{"items":[{"EMPNO":7839,"ENAME":"KING","JOB":"PRESIDENT","HIREDATE":"1981-11-17T00:00:00","SAL":5001,"DEPTNO":10},{"EMPNO":7698,"ENAME":"BLAKE","JOB":"MANAGER","MGR":7839,"HIREDATE":"1981-05-01T00:00:00","SAL":2850,"DEPTNO":30},{"EMPNO":7782,"ENAME":"CLARK","JOB":"MANAGER","MGR":7839,"HIREDATE":"1981-06-09T00:00:00","SAL":2451,"DEPTNO":10}
…………………….
, …………………….
, …………………….
{"EMPNO":7900,"ENAME":"JAMES","JOB":"CLERK","MGR":7698,"HIREDATE":"1981-12-03T00:00:00","SAL":950,"DEPTNO":30},{"EMPNO":7934,"ENAME":"MILLER","JOB":"CLERK","MGR":7782,"HIREDATE":"1982-01-23T00:00:00","SAL":1301,"DEPTNO":10}],"hasMore":false,"limit":100,"offset":0,"count":14,"links":[{"rel":"self","href":"https://dataaccess.adb.us-ashburn-1.oraclecloudapps.com/adb/p/V9bzugzkD523o7s0ToXQg8SykHsVamFiem60kslurzi85av65VpzvyLGoLTYFMUabUQaSpBgvCM/data"}]}
demo@ATP19C>
demo@ATP19C>
demo@ATP19C> $ curl -i --location https://dataaccess.adb.us-ashburn-1.oraclecloudapps.com/adb/p/V9bzugzkD523o7s0ToXQg8SykHsVamFiem60kslurzi85av65VpzvyLGoLTYFMUabUQaSpBgvCM/data
HTTP/1.1 404 Not Found
Date: Tue, 07 May 2024 06:25:47 GMT
Content-Type: application/json
Content-Length: 83
Connection: keep-alive
Strict-Transport-Security: max-age=31536000;includeSubDomains
X-Content-Type-Options: nosniff
Request-Id: defaultq4gSpqVP0LyGDKOf1mVSS1jLW8zvVqc8zSozYOUsKPR8qFYwNrevC9c0
X-Request-ID: e629d4c2837ddc881ce18e40e009c7e4
 
{"code":"ADB-09009","message":"Pre-authenticated url not found or not authorized."}
demo@ATP19C>
 
 
Not only can an entire table or view be shared; users can also specify a SQL query instead.
 
demo@ATP19C> ---
demo@ATP19C> --- Generate PAR url for query statement.
demo@ATP19C> ---
demo@ATP19C>
demo@ATP19C> declare
  2     l_output clob;
  3  begin
  4     dbms_data_access.get_preauthenticated_url(
  5             sql_statement => ' select empno , ename , hiredate from emp ',
  6             expiration_count => 2 ,
  7             result => l_output );
  8
  9     dbms_output.put_line( l_output );
 10  end;
 11  /
{"status":"SUCCESS","id":"nj4WXm9bscirplCDNgz9xtU1gHl6kgXApAdPk5h4irDPBdSYr6cJBSdXZ66YwK2d","preauth_url":"https://dataaccess.adb.us-ashburn-1.oraclecloudapps.com/adb/p/Iom13j0a_JjxCHIE7G_n17pmlqvhCreCoaVJz56JIEZf5WQ4_tHr1eGM0A-LrN5mh_rQ3yE6gR8/data","expiration_ts":"2024-08-05T09:21:47.497Z","expir
ation_count":2}
 
PL/SQL procedure successfully completed.
 
demo@ATP19C>
demo@ATP19C> $ curl -i --location https://dataaccess.adb.us-ashburn-1.oraclecloudapps.com/adb/p/Iom13j0a_JjxCHIE7G_n17pmlqvhCreCoaVJz56JIEZf5WQ4_tHr1eGM0A-LrN5mh_rQ3yE6gR8/data
HTTP/1.1 200 OK
Date: Tue, 07 May 2024 09:22:52 GMT
Content-Type: application/json
Content-Length: 1134
Connection: keep-alive
Strict-Transport-Security: max-age=31536000;includeSubDomains
X-Content-Type-Options: nosniff
Request-Id: defaultXdgraZsyDa3KMbdQIU79MH5Gmi8QrPxrR54rPPMA432Hked2Zkq8cwcl
X-Request-ID: 7b55f0999704ca81cc896616b42f33c2
 
{"items":[{"EMPNO":7839,"ENAME":"KING","HIREDATE":"1981-11-17T00:00:00"},{"EMPNO":7698,"ENAME":"BLAKE","HIREDATE":"1981-05-01T00:00:00"},{"EMPNO":7782,"ENAME":"CLARK","HIREDATE":"1981-06-09T00:00:00"}
, …………………….
, …………………….
, …………………….
,{"EMPNO":7900,"ENAME":"JAMES","HIREDATE":"1981-12-03T00:00:00"},{"EMPNO":7934,"ENAME":"MILLER","HIREDATE":"1982-01-23T00:00:00"}],"hasMore":false,"limit":100,"offset":0,"count":14,"links":[{"rel":"self","href":"https://dataaccess.adb.us-ashburn-1.oraclecloudapps.com/adb/p/Iom13j0a_JjxCHIE7G_n17pmlqvhCreCoaVJz56JIEZf5WQ4_tHr1eGM0A-LrN5mh_rQ3yE6gR8/data"}]}
demo@ATP19C>
 
 
Providing temporary access to Oracle database datasets has been complex in the past. However, the introduction of Pre-Authenticated Database URLs offers a simplified approach, especially for businesses seeking to monetize their data. Complementing ORDS, these URLs are specifically designed for controlled, time-bound access, making the process of selling data as a product straightforward. Whether for B2B interactions or direct consumer sales, these URLs offer a streamlined solution for data sharing.
 
These pre-authenticated URLs are always read-only: nothing other than a SELECT statement can be associated with them, and anything else results in an error.
 
demo@ATP19C> ---
demo@ATP19C> --- when the query statement contains anything other than SELECT, it will
demo@ATP19C> --- fail
demo@ATP19C> ---
demo@ATP19C>
demo@ATP19C> declare
  2     l_output clob;
  3  begin
  4     dbms_data_access.get_preauthenticated_url(
  5             sql_statement => ' update emp set ename = lower(ename) ',
  6             expiration_count => 2 ,
  7             result => l_output );
  8
  9     dbms_output.put_line( l_output );
 10  end;
 11  /
{"status":"FAILURE","return_message":"ORA-20001: Error occurred during sql statement validation. Error Code: -903, Error Message: ORA-00903: invalid table name"}
 
PL/SQL procedure successfully completed.
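 
The generated links can also be reviewed and revoked before they expire. Recent Autonomous Database releases expose DBMS_DATA_ACCESS.LIST_ACTIVE_URLS and DBMS_DATA_ACCESS.INVALIDATE_URL for this; the sketch below assumes those calls are available in your release (check the package documentation for the exact signatures) and reuses the "id" value returned when the URL was generated.
 
declare
   l_urls clob;
begin
   -- list the pre-authenticated URLs that are still active (assumed signature)
   l_urls := dbms_data_access.list_active_urls;
   dbms_output.put_line( l_urls );

   -- revoke a specific URL using the "id" returned at generation time (assumed signature)
   dbms_data_access.invalidate_url(
           id => 'lpv6c1-1l6jJAVG8XCplqKGPaE7sdZ_t90-h58Y64UekWo8YLqCBHPCf2CJfsom1' );
end;
/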

Thursday, May 2, 2024

Using database links with ADB-S without wallet

In the previous post, we saw how to set up a database link connecting two ADB-S instances accessible over the public endpoint with a wallet in place. In this blog post, we will see how to establish a database link connecting two ADB-S instances accessible over the public endpoint without a wallet in place.
 
On the target ADB details page, under Network, click Edit next to the Mutual TLS (mTLS) authentication field and allow TLS authentication by deselecting Require mutual TLS (mTLS) authentication, then click Update. The ADB lifecycle state changes to Updating, after which the Mutual TLS (mTLS) authentication field shows Not required.
 



To create a database link to a public target, the target database must be accessible. Some databases, including ADB, limit access using an access control list (ACL), so make sure the target database allows access from the source database over the database link. If access is limited with an ACL, find the outbound IP address of the source database and allow that IP address to connect to the target database by adding it to the target database's ACL.
 
On the source database
 
demo-user@ATP19C> select jt.*
  2  from v$pdbs, json_table( cloud_identity,'$.OUTBOUND_IP_ADDRESS[*]'
  3      columns( outbound_ips varchar2(20) path '$') ) jt;
 
OUTBOUND_IPS
----------------
150.136.133.92
 
On the target Autonomous Database, add this outbound IP address to the access control list (ACL).
 

 
Then create a credential to access the target database; the username and password passed to dbms_cloud.create_credential are the credentials for the target database.
 
demo-user@ATP19C> begin
  2     dbms_cloud.create_credential(
  3             credential_name =>'DB_LINK_CRED',
  4             username =>'DEMO_USER',
  5             password=>'Good2go1!1234' );
  6  end;
  7  /
 
PL/SQL procedure successfully completed.
 
Then create the database link to the target database using the DBMS_CLOUD_ADMIN package, like this:
 
demo-user@ATP19C> begin
  2     dbms_cloud_admin.create_database_link(
  3             db_link_name => 'DB_LINK_TEST',
  4             hostname => 'adb.us-ashburn-1.oraclecloud.com',
  5             port => 1522,
  6             service_name => 'g26be7c92912cdb_atp21c_low.adb.oraclecloud.com',
  7             credential_name =>'DB_LINK_CRED',
  8             directory_name => null );
  9  end;
 10  /
 
PL/SQL procedure successfully completed.
 
demo-user@ATP19C> select * from dual@DB_LINK_TEST;
 
D
-
X
 
To create a database link with DBMS_CLOUD_ADMIN.CREATE_DATABASE_LINK to a target Autonomous Database on a public endpoint using a secure TCP connection without a wallet, the directory_name parameter must be NULL.
 
demo-user@ATP19C> select banner_full from v$version;
 
BANNER_FULL
------------------------------------------------------------------------
Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production
Version 19.23.0.1.0
 
 
demo-user@ATP19C> select banner_full from v$version@db_link_test;
 
BANNER_FULL
------------------------------------------------------------------------
Oracle Database 21c Enterprise Edition Release 21.0.0.0.0 - Production
Version 21.3.0.0.0
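 
If the link ever needs to be rebuilt, for example after rotating the target password, the same packages can remove what was created above. A minimal sketch, assuming the link and credential names used in this post:
 
begin
   -- remove the database link created earlier
   dbms_cloud_admin.drop_database_link(
           db_link_name => 'DB_LINK_TEST' );

   -- optionally drop the stored credential as well
   dbms_cloud.drop_credential(
           credential_name => 'DB_LINK_CRED' );
end;
/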


Friday, April 12, 2024

Using database links with ADB-S with wallet

Autonomous Database Serverless (ADB-S) users have the option to deploy their instance on either a public or a private endpoint. Whether your connections are made over the public internet or through a virtual cloud network (VCN), they have one thing in common: they are all secured with the Transport Layer Security (TLS 1.2) protocol, so every connection between the client and the database is encrypted and the client and the database can authenticate each other. When it comes to authenticating client and server, there are a couple of options.
 
  • Both client and server authenticate each other (mutual TLS)
  • Only the client authenticates the server (one-way TLS)
 
ADB-S uses mutual TLS by default regardless of the network configuration, so both the client and the database can verify each other's certificates. To complete this mutual authentication, any client connecting to an ADB-S instance must present client credentials, which can be downloaded as a zip file containing an SSO wallet, keystore, truststore and other network configuration files. This pretty much sums up how mTLS works and why you need to download a wallet to connect to an Autonomous Database.
 
In this blog post, we will see how to create a database link from one ADB-S (the source, Autonomous Transaction Processing 19c) to another publicly accessible ADB-S (Autonomous JSON Database 21c) with a wallet (mTLS).
 
Copy the target database wallet, cwallet.sso, which contains the certificates for the target database, to an object storage bucket.
 
demo-user@ATP19C> variable uri varchar2(200)
demo-user@ATP19C> exec :uri := 'https://objectstorage.us-ashburn-1.oraclecloud.com/n/idcglquusbz6/b/MY_DEMO_BUCKET/o/DBLINK_TEST/';
 
PL/SQL procedure successfully completed.
 
demo-user@ATP19C> select object_name
  2  from dbms_cloud.list_objects('my_demo_cred',:uri);
 
no rows selected

demo-user@ATP19C>
 
Then use dbms_cloud.get_object to download the target database wallet into a directory created or available on the source database.
 
demo-user@ATP19C> begin
  2     dbms_cloud.get_object(
  3             credential_name =>'my_demo_cred',
  4             object_uri => 'https://objectstorage.us-ashburn-1.oraclecloud.com/n/idcglquusbz6/b/MY_DEMO_BUCKET/o/DBLINK_TEST/cwallet.sso',
  5             directory_name => 'DATA_PUMP_DIR' );
  6  end;
  7  /
 
PL/SQL procedure successfully completed.
 
demo-user@ATP19C> select object_name
  2  from table( dbms_cloud.list_files('DATA_PUMP_DIR') )
  3  order by created desc
  4  fetch first 1 row only;
 
OBJECT_NAME
------------------------------
cwallet.sso
 
 
On the source ADB-S instance, create a credential to access the target database; the username and password passed to dbms_cloud.create_credential are the credentials for the target database.
 
demo-user@ATP19C> begin
 2    dbms_cloud.create_credential(
 3            credential_name => 'target_db_cred',
 4            username => 'demo_user',
 5            password => 'Good2go1!1234' );
 6 end;
 7 /
 
PL/SQL procedure successfully completed.
 
Then create the database link to the target database using the DBMS_CLOUD_ADMIN package, like this:
 
demo-user@ATP19C> begin
 2    dbms_cloud_admin.create_database_link(
 3            db_link_name=>'target_db_link',
 4            hostname => 'adb.us-ashburn-1.oraclecloud.com',
 5            port => 1522,
 6            service_name => 'g26be7c92912cdb_ajd21c_low.adb.oraclecloud.com',
 7            credential_name => 'target_db_cred',
 8            directory_name => 'DATA_PUMP_DIR' );
 9 end;
 10 /
 
PL/SQL procedure successfully completed.
 
Then, when we try to access data on the target database over the database link, it fails like this:
 
demo-user@ATP19C> select * from dual@target_db_link;
select * from dual@target_db_link
                  *
ERROR at line 1:
ORA-01017: invalid username/password; logon denied
ORA-02063: preceding line from TARGET_DB_LINK
 
But the real problem is not an incorrect password; it is that the USERNAME was supplied in lowercase when it should have been in uppercase.
 
demo-user@ATP19C> begin
  2     dbms_cloud.create_credential(
  3             credential_name => 'target_db_cred',
  4             username => 'DEMO_USER',
  5             password => 'Good2go1!1234' );
  6  end;
  7  /
 
PL/SQL procedure successfully completed. 

Once that was fixed, the database link works perfectly.
 
demo-user@ATP19C> select host from dba_db_links where db_link = 'TARGET_DB_LINK';
 
HOST
-----------------------------------------------------------------------------------------------
(DESCRIPTION=(ADDRESS=(PROTOCOL=tcps)(HOST="adb.us-ashburn-1.oraclecloud.com")(PORT=1522))
(CONNECT_DATA=(SERVICE_NAME=g26be7c92912cdb_ajd21c_low.adb.oraclecloud.com))
(SECURITY=(MY_WALLET_DIRECTORY="/u03/dbfs/E47379BFF4313E4EE0539118000A6636/data/dpdump")
(SSL_SERVER_DN_MATCH=TRUE)))
 
demo-user@ATP19C>
demo-user@ATP19C> select banner_full from v$version;
 
BANNER_FULL
-------------------------------------------------------------------------
Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production
Version 19.23.0.1.0
 
 
demo-user@ATP19C> select banner_full from v$version@target_db_link ;
 
BANNER_FULL
-------------------------------------------------------------------------
Oracle Database 21c Enterprise Edition Release 21.0.0.0.0 - Production
Version 21.3.0.0.0
 

Sunday, March 17, 2024

Natural Language to SQL generation - Part II

Using natural language to query data is an easy way to answer business questions. We covered the basics in the previous blog post; in this post we will look at how this can work on your data when table and column names are meaningless. It is possible with Autonomous Database, and there is no magic: if the table and column names are not descriptive, we can help the LLM interpret their meaning using the built-in database feature called comments. Comments are descriptive notes about the purpose or usage of tables and columns, and the better the comment, the more likely the LLM will know how to use the table or column to generate the right query.
 
Let’s take an example: my database has three tables, and their table and column names are meaningless.
 
rajesh@ATP19C> create table t1(
  2    c1 number,
  3    c2 varchar2(2),
  4    c3 varchar2(40),
  5    c4 varchar2(30),
  6    c5 number,
  7    c6 varchar2(20),
  8    c7 number,
  9    c8 varchar2(11),
 10    c9 number,
 11    c10 varchar2(40) );
 
rajesh@ATP19C> create table t2(
  2    c1 number,
  3    c2 varchar2(20),
  4    c3 varchar2(40),
  5    c4 varchar2(1),
  6    c5 number,
  7    c6 varchar2(20),
  8    c7 varchar2(40),
  9    c8 varchar2(10),
 10    c9 varchar2(30),
 11    c10 number ,
 12    c11 varchar2(40),
 13    c12 number,
 14    c13 number,
 15    c14 varchar2(25),
 16    c15 varchar2(30),
 17    c16 number,
 18    c17 varchar2(50),
 19    c18 varchar2(14),
 20    c19 number,
 21    c20 number,
 22    c21 Date,
 23    c22 Date,
 24    c23    varchar2(1)  );
 
rajesh@ATP19C> create table t3(
  2    c1 number,
  3    c2 number,
  4    c3 date,
  5    c4 number,
  6    c5 number,
  7    c6 number,
  8    c7 number );   
 
There is zero chance that a natural language query will know that these tables represent countries, customers and sales information. We can fix that ambiguity by adding database comments.
 
rajesh@ATP19C> set feedback off
rajesh@ATP19C> insert /*+ append */ into t1 select * from sh.countries;
rajesh@ATP19C> insert /*+ append */ into t2 select * from sh.customers;
rajesh@ATP19C> insert /*+ append */ into t3 select * from sh.sales;
rajesh@ATP19C> commit;
rajesh@ATP19C>
rajesh@ATP19C> comment on table t1 is 'contains country name, region , subregion and ISO code';
rajesh@ATP19C> comment on column t1.c1 is 'country ids. use this column to join with other tables';
rajesh@ATP19C> comment on column t1.c2 is 'country ISO code';
rajesh@ATP19C> comment on column t1.c3 is 'country name';
rajesh@ATP19C> comment on column t1.c4 is 'country subregion name';
rajesh@ATP19C> comment on column t1.c5 is 'country subregion ids';
rajesh@ATP19C> comment on column t1.c6 is 'country region name';
rajesh@ATP19C> comment on column t1.c7 is 'country region ids';
rajesh@ATP19C>
rajesh@ATP19C>
rajesh@ATP19C> comment on table t2 is 'contains customer information like name, gender, date of birth, marital status, postal address';
rajesh@ATP19C> comment on column t2.c1 is 'customer ids. primary key column use this column to join with other tables';
rajesh@ATP19C> comment on column t2.c2 is 'first name of the customer';
rajesh@ATP19C> comment on column t2.c3 is 'last name of the customer';
rajesh@ATP19C> comment on column t2.c4 is 'customer gender';
rajesh@ATP19C> comment on column t2.c5 is 'customer year of birth';
rajesh@ATP19C> comment on column t2.c6 is 'customer marital status';
rajesh@ATP19C> comment on column t2.c7 is 'customer street address';
rajesh@ATP19C> comment on column t2.c8 is 'postal code of the customer';
rajesh@ATP19C> comment on column t2.c9 is 'city where the customer lives';
rajesh@ATP19C> comment on column t2.c11 is 'customer geography: state or province';
rajesh@ATP19C> comment on column t2.c13 is 'foreign key to the countries table';
rajesh@ATP19C> comment on column t2.c14 is 'customer main phone number';
rajesh@ATP19C> comment on column t2.c15 is 'customer income level';
rajesh@ATP19C> comment on column t2.c16 is 'customer credit limit';
rajesh@ATP19C> comment on column t2.c17 is 'customer email id';
rajesh@ATP19C>
rajesh@ATP19C> comment on table t3 is 'contains sales information about each customers';
rajesh@ATP19C> comment on column t3.c2 is 'foreign key to the customer table';
rajesh@ATP19C> comment on column t3.c6 is 'product quantity sold with the transaction';
rajesh@ATP19C> comment on column t3.c7 is 'invoiced amount to the customer';
rajesh@ATP19C>
rajesh@ATP19C>
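 
Before wiring up the AI profile, it is easy to check what the model will actually see, since the comments live in the standard dictionary views:
 
select table_name, comments
from   user_tab_comments
where  table_name in ('T1','T2','T3');

select table_name, column_name, comments
from   user_col_comments
where  table_name in ('T1','T2','T3')
order  by table_name, column_name;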
 
 
That’s it: those meaningless table and column names can now be understood by the LLM using Select AI. A Select AI “profile” encapsulates the information needed to interact with the LLM. It includes details like the AI provider, the model to use, the source tables used for natural language queries and whether comments should be passed to the model for SQL generation.
 
rajesh@ATP19C> begin
  2   dbms_cloud_ai.create_profile( 'MY_AI_PROFILE',
  3            '{"provider": "cohere",
  4            "credential_name": "MY_AI_CRED",
  5            "comments":"true",
  6            "object_list": [
  7                                        {"owner":"RAJESH","name":"T1"},
  8                                        {"owner":"RAJESH","name":"T2"},
  9                                        {"owner":"RAJESH","name":"T3"}
 10                                      ]      }' );
 11  end;
 12  /
 
PL/SQL procedure successfully completed.
 
Now we can start asking questions in natural language against this cryptic schema. Even though the table and column names are meaningless, the LLM is able to identify the right tables and columns through the comments and generate queries.
 
rajesh@ATP19C> exec dbms_cloud_ai.set_profile('MY_AI_PROFILE');
 
PL/SQL procedure successfully completed.
 
rajesh@ATP19C> Select AI showsql how many customers are there ;
 
RESPONSE
--------------------------------------------------------------------------------
SELECT COUNT(*) AS customer_count
FROM RAJESH.T2
 
 
rajesh@ATP19C> Select AI showsql what are our total sales;
 
RESPONSE
--------------------------------------------------------------------------------
SELECT SUM(T3.C7) AS TotalSales
FROM T1 T1 JOIN T2 T2 ON T1.C1 = T2.C13 JOIN T3 ON T2.C1 = T3.C2
 
 
rajesh@ATP19C> Select AI showsql list total sales by gender;
 
RESPONSE
--------------------------------------------------------------------------------
SELECT t3.c4, SUM(t3.c6)
FROM rajesh.t3 JOIN rajesh.t2 ON rajesh.t3.c1 = rajesh.t2.c1
GROUP BY t3.c4
ORDER BY SUM(t3.c6) DESC
 
 
rajesh@ATP19C> Select AI showsql list customers born in the year 1980;
 
RESPONSE
--------------------------------------------------------------------------------
SELECT t2.C16 AS customer_credit_limit, t2.C4 AS customer_gender, t2.C6 AS custo
mer_marital_status, t2.C11 AS customer_province, t2.C9 AS customer_city, t2.C15
AS customer_income_level, t2.C5 AS customer_year_of_birth, t2.C17 AS customer_em
ail, t2.C3 AS customer_last_name, t2.C2 AS customer_first_name
FROM rajesh.t2
WHERE t2.C5 = 1980
 
 
Properly describing your data will help you use natural language to get answers. Comments not only help an LLM formulate queries successfully, they also help you understand your data. Starting with Oracle Database 23c, not just comments but also annotations can help the LLM get the best out of your data models; a quick sketch follows below. Stay tuned.
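 
The sketch below shows the kind of annotation DDL that 23c introduces. Treat it as an illustration of the syntax rather than something that was run against the schema above.
 
-- table-level annotation
alter table t2 annotations ( add Description 'customer master data used for natural language queries' );

-- column-level annotation
alter table t2 modify c17 annotations ( add Description 'customer email address' );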
 

Monday, March 11, 2024

Natural Language to SQL generation - Part I

Oracle Autonomous Database enables you to query data using natural language: combining generative AI large language models (LLMs) with Oracle SQL empowers you to describe what you want (declarative intent) and let the database generate the SQL query relevant to your schema.
 
Enabling this feature is a new package, DBMS_CLOUD_AI, available in Autonomous Database, which enables the use of an LLM for generating SQL from natural language prompts. The package provides access to the user-specified LLM along with knowledge of the user-accessible database metadata, which makes it possible to produce runnable SQL queries applicable to the schema. In this blog post we will see how to get started with Select AI in Autonomous Database.
 
Sign in to your Autonomous Database instance with an admin-privileged account, grant your user account access to the DBMS_CLOUD_AI package, and add it to the network ACL for the AI provider's endpoint.
 
admin@ATP19C> grant connect,resource to rajesh identified by "*************";
 
Grant succeeded.
 
admin@ATP19C> grant execute on dbms_cloud_ai to rajesh;
 
Grant succeeded.
 
admin@ATP19C> begin
  2    dbms_network_acl_admin.append_host_ace(
  3        host => 'api.cohere.ai',
  4        ace  => xs$ace_type(privilege_list => xs$name_list('http'),
  5                            principal_name => 'RAJESH',
  6                            principal_type => xs_acl.ptype_db));
  7  end;
  8  /
 
PL/SQL procedure successfully completed.
 
admin@ATP19C>
 
For this setup I will be using Cohere as the AI provider, although OpenAI and Azure OpenAI are also currently supported in Autonomous Database.
 
To enable Cohere to generate SQL from natural language prompts, obtain an API key from a Cohere free trial account (https://dashboard.cohere.com/): open the Dashboard and click API keys in the left navigation.
 
 
Then create a database credential for your AI provider account; in the case of Cohere, the password is the uniquely generated token for API usage.



 
admin@ATP19C> conn rajesh/"*************"@atp19c_vpn
Connected.
rajesh@ATP19C>
rajesh@ATP19C> begin
  2     dbms_cloud.create_credential(
  3             credential_name => 'MY_AI_CRED',
  4             username => <your_signin_user_account_for_cohere>,
  5             password => <your_uniquely_generated_token_for_API_usage>);
  6  end;
  7  /
 
PL/SQL procedure successfully completed.
 
Then, create your AI profile, with the list of schemas to consider and optionally any tables.
 
rajesh@ATP19C> begin
  2   dbms_cloud_ai.create_profile( 'MY_AI_PROFILE',
  3              '{"provider": "cohere",
  4             "credential_name": "MY_AI_CRED",
  5             "object_list": [
  6                                               {"owner":"SH","name":"PROMOTIONS"},
  7                                               {"owner":"SH","name":"PRODUCTS"},
  8                                               {"owner":"SH","name":"CUSTOMERS"},
  9                                               {"owner":"SH","name":"TIMES"},
 10                                               {"owner":"SH","name":"COUNTRIES"},
 11                                               {"owner":"SH","name":"COSTS"},
 12                                               {"owner":"SH","name":"CHANNELS"},
 13                                               {"owner":"SH","name":"SALES"},
 14                                               {"owner":"SH","name":"SUPPLEMENTARY_DEMOGRAPHICS"}
 15                                             ]       }' );
 16  end;
 17  /
 
PL/SQL procedure successfully completed.
 
Finally, we need to set our AI profile for each database session.
 
rajesh@ATP19C> exec dbms_cloud_ai.set_profile('MY_AI_PROFILE');
 
PL/SQL procedure successfully completed.
 
Let’s look at a few examples, starting simply: how many customers are there? Using the AI keyword we run the SQL command and see the result set; adding the keyword “showsql” after “Select AI” shows the SQL query generated by the LLM.
 
rajesh@ATP19C> Select AI how many customers are there;
 
CUSTOMER_COUNT
--------------
         55500
 
rajesh@ATP19C> Select AI showsql how many customers are there;
 
RESPONSE
---------------------------------------------------
SELECT COUNT(*) AS customer_count
FROM SH.CUSTOMERS
 
The next query highlights the LLM’s broader knowledge: find the customers that are in India.
 
rajesh@ATP19C> Select AI how many customers are there in the country India;
 
CUSTOMER_COUNT
--------------
             0
 
rajesh@ATP19C> Select AI showsql how many customers are there in the country India;
 
RESPONSE
---------------------------------------------------------------------------------------------------------------------------------------------------------
SELECT COUNT(*) AS customer_count
FROM SH.CUSTOMERS c
JOIN SH.COUNTRIES country ON c.COUNTRY_ID = country.COUNTRY_ID
WHERE country.COUNTRY_NAME = 'India'
 
 
You can use the chat option to interact with the LLM as you would with a chatbot, using a single message. In the following example, a developer needs to create a table. By describing the table you want, it generates a CREATE TABLE statement you can use directly or modify to suit your needs.
 
 
rajesh@ATP19C> Select AI chat show the create table statement for a table called agents with agent name and age;
 
RESPONSE
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Here is the SQL code to create a table called `Agents` with columns for agent name and age:
```sql
CREATE TABLE Agents (
    agent_name VARCHAR(255) NOT NULL,
    age INT NOT NULL
);
```
 
This code creates a table with a `VARCHAR` column for the agent's name and an `INT` column for the agent's age. The `NOT NULL` constraint prevents from inserting null values in these columns.
 
You can add more columns as needed, such as an ID column for uniqueness and indexing for faster lookup.
 
Once the table is created, you can add data to it using SQL insert statements or other data manipulation techniques.
 
Select AI can even handle a more complex SQL query like this:
 
rajesh@ATP19C> Select AI what is Top 3 selling products;
 
PROD_NAME                                          SUM(AMOUNT_SOLD)       RANK
-------------------------------------------------- ---------------- ----------
Envoy Ambassador                                         15011642.5          1
Mini DV Camcorder with 3.5" Swivel LCD                    8314815.4          2
17" LCD w/built-in HDTV Tuner                            7189171.77          3
 
rajesh@ATP19C> Select AI showsql what is Top 3 selling products;
 
RESPONSE
---------------------------------------------------------------------------------
SELECT p.prod_name, SUM(amount_sold), RANK() OVER (ORDER BY SUM(amount_sold) DESC) as rank
FROM sh.sales, sh.products p
WHERE sales.prod_id = p.prod_id
GROUP BY p.prod_name
ORDER BY rank
FETCH FIRST 3 ROWS ONLY
 
You can also use ‘chat’ to understand what a query is doing, as shown below. 
 
rajesh@ATP19C> select AI chat
  2  with rws as (
  3  select amount_sold , prod_category, cust_gender
  4  from sh.sales
  5  join sh.products using (prod_id)
  6  join sh.customers using (cust_id)
  7  group by prod_category, cust_gender )
  8  select *
  9  from rws
 10  pivot( count(*) for cust_gender in ('F','M') )
 11  /
 
RESPONSE
-----------------------------------------------------------------------------------
 
Sorry, unfortunately a valid SELECT statement could not be generated for your natural language prompt. Here is some more information to help you further:
 
Here is the output after executing the SQL query on the given table and subquery:
```sql
with rws as (
    select amount_sold, prod_category, cust_gender
    from sh.sales
    join sh.products using (prod_id)
    join sh.customers using (cust_id)
    group by prod_category, cust_gender
)
select *
from rws
pivot(count(*) for cust_gender in ('F', 'M'))
```
| ProdCategory | AmountSold | CustGender | SalesCount |
| ------------ | ----------- | ----------- | ----------- |
| Food        | 1245.78     | M         | 262 |
| Food        | 3054.68     | M         | 390 |
| Food        | 1434.34     | M         | 337 |
| Food        | 738.28      | M         | 165 |
| Total Foods | 6263.88     | M         | 1030 |
| Dresses     | 4699.66     | F         | 704 |
| Dresses     | 3789.84     | F         | 557 |
| Dresses     | 497.40      | F         | 83 |
| Total Dresses| 8972.86     | F         | 1344 |
| Total       | 15226.74    | M/F        | 2237 |
 
The pivot table summarizes the amount sold and the number of sales according to the product category and customer gender. The amounts are aggregated with the `count` function and categorized into 'M' and 'F' for male and female customer genders.  The `total` row is a compilation of the aggregated am
ounts and gender counts.
 
Are there any other outputs you would like to know? Feel free to ask and I'll be more than happy to assist you!
 
 
The ability to generate SQL queries using natural language and get useful results is powerful: you get ease of use without SQL expertise, and since the feature is integrated with the database and produces standard SQL, it is readily available to analytics and applications.
 
Finally, for stateless environments like the Database Actions SQL worksheet in Autonomous Database, we can use the DBMS_CLOUD_AI.GENERATE function in a SQL query, since it allows the AI profile to be specified directly for stateless invocations.
 
rajesh@ATP19C> col response for a80
rajesh@ATP19C> select dbms_cloud_ai.generate(prompt=>'how many customers are there',
  2             profile_name=>'MY_AI_PROFILE',
  3             action=>'showsql')  as response
  4  from dual;
 
RESPONSE
--------------------------------------------------------------------------------
SELECT COUNT(*) AS customer_count
FROM SH.CUSTOMERS
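 
The same function supports other actions as well, for example 'chat', mirroring the Select AI chat keyword used earlier. A minimal sketch against the same profile:
 
select dbms_cloud_ai.generate(
          prompt       => 'explain the difference between a view and a materialized view',
          profile_name => 'MY_AI_PROFILE',
          action       => 'chat') as response
from dual;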