Wednesday, August 14, 2024

Data share - Part I

Data sharing enables you to share data with one or more customers. Oracle Autonomous Database lets you create shares using the Data Sharing tool (which is a mix of managed ORDS services and the DBMS_SHARE API). Data sharing consists of two steps:
 
  •          The provider provides the data share for access.
  •          The consumer receives access to the published shares.
 
Oracle data sharing is based on the open Delta Sharing protocol, which provides a simple REST-based API to share data in Parquet format. Data is made accessible by a data sharing provider (such as Oracle Autonomous Database) to a data sharing recipient (such as Power BI, Tableau, Apache Spark, or Java).
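To make the "REST API plus Parquet files" idea a bit more concrete, here is a minimal sketch of how a recipient might read the answer to a table query. It assumes the response shape described by the open Delta Sharing protocol as I understand it: newline-delimited JSON with one "protocol" line, one "metaData" line, and one "file" line per Parquet file, each carrying a short-lived URL. The sample response, table id, and URLs below are invented for illustration and do not come from a real server.

```python
import json

# Hypothetical response body (illustrative values only, not real data).
SAMPLE_RESPONSE = "\n".join([
    json.dumps({"protocol": {"minReaderVersion": 1}}),
    json.dumps({"metaData": {"id": "hypothetical-table-id",
                             "format": {"provider": "parquet"},
                             "schemaString": "{}",
                             "partitionColumns": []}}),
    json.dumps({"file": {"url": "https://storage.example.com/part-0000.parquet?sig=abc",
                         "id": "f1", "partitionValues": {}, "size": 1024}}),
    json.dumps({"file": {"url": "https://storage.example.com/part-0001.parquet?sig=def",
                         "id": "f2", "partitionValues": {}, "size": 2048}}),
])


def parse_query_response(text):
    """Split a newline-delimited JSON response into protocol, metadata and files."""
    result = {"protocol": None, "metadata": None, "files": []}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between records
        record = json.loads(line)
        if "protocol" in record:
            result["protocol"] = record["protocol"]
        elif "metaData" in record:
            result["metadata"] = record["metaData"]
        elif "file" in record:
            result["files"].append(record["file"])
    return result


parsed = parse_query_response(SAMPLE_RESPONSE)
parquet_urls = [f["url"] for f in parsed["files"]]
total_bytes = sum(f["size"] for f in parsed["files"])
```

The point of the sketch is that the provider never ships the rows through the API itself; the recipient gets a list of Parquet file URLs and reads the data directly from storage with whatever tool it prefers.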
 
With traditional methods of data sharing, users typically follow one of these approaches:
 
  •          Send data via email.
  •          Share data through an FTP server.
  •          Use an application-specific API for data extraction.
  •          Utilize vendor-specific tools to copy the required data.
 
While the traditional methods generally work, they come with their own drawbacks:
 
  •          Managing a separate process for data extraction is a labour-intensive operation.
  •          Extracted and duplicated data is prone to staleness.
  •          The architecture and process are difficult to maintain and hard to scale.
  •          Redundant data extraction can introduce format compatibility issues.
 
The modern way of data sharing must be open, secure, real-time, and vendor-neutral, and it must avoid the pitfalls of extracting and duplicating data for individual consumers in a collaborative environment. Delta Sharing is an open protocol for secure, real-time exchange of large datasets that satisfies all these criteria; it is vendor agnostic and supported by multiple clients and programming languages. The Delta Sharing protocol aims to solve the following problems:
 
  •          Share data without copying it to another system.
  •          Let the producer control the state of the data (the version of the data).
  •          Be an open cross-platform solution.
  •          Support a wide range of clients such as Power BI, Tableau, Apache Spark, pandas, and Java.
  •          Provide flexibility to consume data using the tools of choice for BI, machine learning, and AI use cases.
  •          Provide strong security, auditing, and governance.
  •          Scale to massive data sets.
 
At a high level, the Delta Sharing protocol works as follows:
 
  •          The share provider user creates and publishes a data share that can be shared with one or more recipients.
  •          The share provider user creates and authorizes recipients.
  •          Every recipient gets a personal activation link to download their own JSON profile, which contains the information needed to access the data share.
  •          The recipient subscribes to the data share provider by using the JSON configuration profile.
  •          The recipient retrieves data from the share.
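 
The recipient side of the steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses the basic profile fields from the open Delta Sharing protocol ("shareCredentialsVersion", "endpoint", "bearerToken"), and a profile generated by Autonomous Database may carry additional fields. The endpoint, token, and the share, schema, and table names are hypothetical. The requests are built but not sent, so the sketch runs without a server.

```python
import json
import urllib.request

# Hypothetical profile content, as it might be downloaded from the activation link.
SAMPLE_PROFILE = json.dumps({
    "shareCredentialsVersion": 1,
    "endpoint": "https://provider.example.com/delta-sharing/",
    "bearerToken": "hypothetical-token",
})


def load_profile(text):
    """Parse the JSON profile and check the fields this sketch relies on."""
    profile = json.loads(text)
    for key in ("shareCredentialsVersion", "endpoint", "bearerToken"):
        if key not in profile:
            raise ValueError(f"profile is missing {key!r}")
    return profile


def build_request(profile, path, body=None):
    """Build (but do not send) an authenticated request against the share endpoint."""
    url = profile["endpoint"].rstrip("/") + path
    headers = {"Authorization": f"Bearer {profile['bearerToken']}"}
    data = None
    if body is not None:
        # Table queries are POST requests with a JSON body.
        data = json.dumps(body).encode("utf-8")
        headers["Content-Type"] = "application/json"
    method = "POST" if body is not None else "GET"
    return urllib.request.Request(url, data=data, headers=headers, method=method)


profile = load_profile(SAMPLE_PROFILE)

# Step 1: discover which shares the provider has published to this recipient.
list_shares = build_request(profile, "/shares")

# Step 2: ask for the data files of one table (names are hypothetical).
query_table = build_request(
    profile,
    "/shares/sales/schemas/public/tables/orders/query",
    body={"limitHint": 10},
)
```

In practice you would pass these requests to urllib.request.urlopen, or skip the hand-written HTTP entirely and use an existing Delta Sharing client; the sketch is only meant to show that the profile file is all a recipient needs to start talking to the provider.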
 
The overall architectural approach for the Delta Sharing protocol looks like this:
 
[Figure: architecture of the Delta Sharing protocol]
 
In the next posts in this series, we will walk through a step-by-step approach to setting up the Delta Sharing protocol using Autonomous Database and managed ORDS.
 
 
 

