This will be the continuation of the previous
blog post.
To efficiently scan data in object store,
Autonomous database (ADB) provides various optimizations for all common data
types, table formats and directory structures in object storage. With the
recently introduced new feature (called Implicit partitioning) for external
tables, this enables query performance by enable pruning of data based on
logical conditions. This new feature also make it much easier to create and
manage these partitioned tables, by simply pointing the root directory for your
tables (data) files. That is, we can easily create external tables across
thousands of partitions and millions of files within a single command.
External tables with Implicit
partitioning automatically discover the underlying partition structure based on
the hierarchical structure of files and add these additional partition keys to
external table structures.
Here is the structure of our object storage.
demo@ATP19C> select
object_name, bytes
2 from dbms_cloud.list_objects(
3 credential_name => 'API_KEY_CRED',
4 location_uri => :uri);
OBJECT_NAME
BYTES
--------------------------------------------------- ----------
Sales/ 0
Sales/Country=UK/ 0
Sales/Country=UK/year=2024/ 0
Sales/Country=UK/year=2024/data_deptno_10.parquet 2163
Sales/Country=UK/year=2024/data_deptno_20.parquet 2218
Sales/Country=US/ 0
Sales/Country=US/year=2023/ 0
Sales/Country=US/year=2023/data_deptno_30.parquet 2257
8 rows selected.
To emphasize this external table as
Automatic Implicit partitioning, we need to create it as using DBMS_CLOUD API
demo@ATP19C> begin
2 dbms_cloud.create_external_table(
3 table_name => 'xt_implicit_hive_demo'
4 , credential_name => 'API_KEY_CRED'
5 , file_uri_list => :uri
6 , format => '{"schema":"first","type":"parquet","implicit_partition_type":"hive"}');
7 end;
8 /
PL/SQL procedure successfully
completed.
Partition columns are automatically detected,
and implicit partitioning is enabled by setting the format option IMPLICIT_PARTITION_TYPE to HIVE.
To detect the partition columns ADB
starts searching from the beginning of the URL path, specified by FILE_URI_LIST for "=" , when
found the left part of = is treated as the column name and the
right part of = as value, then the search continues for
next equal sign "=" until after the first partition key
column and so on.
Query using implicit partitioning.
demo@ATP19C> select * from xt_implicit_hive_demo;
demo@ATP19C> select
count(*) from xt_implicit_hive_demo where country ='US' and year = '2023';
COUNT(*)
----------
6
By specifying predicates country ='US'
and year = '2023', forces the database to read only the underlying directory
that corresponds to this filter.
Moreover, we can add new files and
directories in object storage, this new data is automatically recognized by
ADB, since all partition pruning is done at run-time.
demo@ATP19C> select
count(*) from xt_implicit_hive_demo where country ='US' and year = '2025';
COUNT(*)
----------
0
Once we upload a sample data for the year
2025, ADB automatically prunes partitions at runtime.
demo@ATP19C> select
object_name, bytes
2 from dbms_cloud.list_objects(
3 credential_name => 'API_KEY_CRED',
4 location_uri => :uri);
OBJECT_NAME BYTES
--------------------------------------------------- ----------
Sales/ 0
Sales/Country=UK/ 0
Sales/Country=UK/year=2024/ 0
Sales/Country=UK/year=2024/data_deptno_10.parquet 2163
Sales/Country=UK/year=2024/data_deptno_20.parquet 2218
Sales/Country=US/ 0
Sales/Country=US/year=2023/ 0
Sales/Country=US/year=2023/data_deptno_30.parquet 2257
Sales/Country=US/year=2025/ 0
Sales/Country=US/year=2025/data_deptno_40.parquet 2414
10 rows selected.
demo@ATP19C> select
count(*) from xt_implicit_hive_demo where country ='US' and year = '2025';
COUNT(*)
----------
8
This shows how ADB automatically manages
partitions by enhancing query performance by skipping unnecessary data segments
based on predicates, if the underlying data structure of the object store
changes, then there is no need to modify / Sync anything on ADB.
2 from dbms_cloud.list_objects(
3 credential_name => 'API_KEY_CRED',
4 location_uri => :uri);
--------------------------------------------------- ----------
Sales/ 0
Sales/Country=UK/ 0
Sales/Country=UK/year=2024/ 0
Sales/Country=UK/year=2024/data_deptno_10.parquet 2163
Sales/Country=UK/year=2024/data_deptno_20.parquet 2218
Sales/Country=US/ 0
Sales/Country=US/year=2023/ 0
Sales/Country=US/year=2023/data_deptno_30.parquet 2257
2 dbms_cloud.create_external_table(
3 table_name => 'xt_implicit_hive_demo'
4 , credential_name => 'API_KEY_CRED'
5 , file_uri_list => :uri
6 , format => '{"schema":"first","type":"parquet","implicit_partition_type":"hive"}');
7 end;
8 /
EMPNO ENAME JOB MGR HIREDATE SAL COMM DEPTNO COUNTRY YEAR
_____ ______ ____________ _____ ______________ _______ _______ _______ _______ _____
7839 KING PRESIDENT 17-NOV-1981 5000 10 UK 2024
7782 CLARK MANAGER 7839 09-JUN-1981 2450 10 UK 2024
7934 MILLER CLERK 7782 23-JAN-1982 1300 10 UK 2024
7566 JONES MANAGER 7839 02-APR-1981 2975 20 UK 2024
7788 SCOTT ANALYST 7566 19-APR-1987 3000 20 UK 2024
7902 FORD ANALYST 7566 03-DEC-1981 3000 20 UK 2024
7369 SMITH CLERK 7902 17-DEC-1980 800 20 UK 2024
7876 ADAMS CLERK 7788 23-MAY-1987 1100 20 UK 2024
7698 BLAKE MANAGER 7839 01-MAY-1981 2850 30 US 2023
7499 ALLEN SALESMAN 7698 20-FEB-1981 1600 300 30 US 2023
7521 WARD SALESMAN 7698 22-FEB-1981 1250 500 30 US 2023
7654 MARTIN SALESMAN 7698 28-SEP-1981 1250 1400 30 US 2023
7844 TURNER SALESMAN 7698 08-SEP-1981 1500 0 30 US 2023
7900 JAMES CLERK 7698 03-DEC-1981 950 30 US 2023
----------
6
----------
0
2 from dbms_cloud.list_objects(
3 credential_name => 'API_KEY_CRED',
4 location_uri => :uri);
--------------------------------------------------- ----------
Sales/ 0
Sales/Country=UK/ 0
Sales/Country=UK/year=2024/ 0
Sales/Country=UK/year=2024/data_deptno_10.parquet 2163
Sales/Country=UK/year=2024/data_deptno_20.parquet 2218
Sales/Country=US/ 0
Sales/Country=US/year=2023/ 0
Sales/Country=US/year=2023/data_deptno_30.parquet 2257
Sales/Country=US/year=2025/ 0
Sales/Country=US/year=2025/data_deptno_40.parquet 2414
----------
8
No comments:
Post a Comment