Friday, August 7, 2020

HCC Support for row level locking

Most relational databases store data in row-major format: each row has multiple columns, and Oracle accesses a column by reading the row, locating the column, reading its value (if one exists) and returning it to the end user. This ROWID- or index-based access is very efficient for most general-purpose and OLTP query engines. On the other hand, if we want a single column from the table – perhaps to perform an aggregation – we incur significant overhead, and the wider the table the greater the overhead.
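
To make that overhead concrete, a sketch of a single-column aggregate (big_table is the demo table used throughout this post):

-- Only the id column is needed, yet a row store must read every full row to find it
select avg(id) from big_table;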

 

Columnar database engines operate on columns rather than rows. They store the values of a column co-located, often with plenty of optimizations built into the way the column is stored, which makes column-based access fast: instead of having to read the whole row to extract the value of a single column, the engine can iterate over a large block and retrieve many values in a multi-block operation. Columnar databases are therefore geared towards analytic or read-mostly workloads.

 

Oracle Hybrid Columnar Compression (HCC) takes the advantage of columnar data organization, storing columns separately within a new storage unit called a compression unit (CU), but unlike pure columnar databases it does not abandon ROWID-based access. The CU is written contiguously to disk in the form of multiple standard Oracle blocks, and all information pertaining to a given row lives within the same CU, allowing Oracle to blindly issue one or two read requests matching the size of the CU and be sure the row's data has been retrieved.
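
For reference, HCC comes in four levels, all built on the same CU mechanism and all requiring Oracle storage (Exadata, ZFSSA or FS1). A sketch of the DDL (the table names here are just illustrative):

-- Warehouse compression: lighter, optimized for query performance
create table t_query_low  compress for query low  as select * from big_table;
create table t_query_high compress for query high as select * from big_table;
-- Archive compression: heaviest, optimized for space
create table t_arch_low  compress for archive low  as select * from big_table;
create table t_arch_high compress for archive high as select * from big_table;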

 

Updating a single row in an HCC-compressed table locks the entire CU containing that row. This can cause extreme contention in OLTP-type systems, and this CU-level locking is the main reason HCC is not recommended for tables or partitions whose data is actively updated.

 

demo@PDB1> create table stage_qh
  2  compress for query high
  3  as
  4  select * from big_table
  5  where rownum <=1000000;

 

So we have a table HCC-compressed for query high; the size of the table is around 10MB, with 1,171 blocks completely filled.

 

demo@PDB1> exec show_space('STAGE_QH');
Unformatted Blocks .....................               0
FS1 Blocks (0-25) ......................               0
FS2 Blocks (25-50) .....................               0
FS3 Blocks (50-75) .....................               0
FS4 Blocks (75-100).....................               0
Full Blocks ............................           1,171
Total Blocks............................           1,280
Total Bytes.............................      10,485,760
Total MBytes............................              10
Unused Blocks...........................              81
Unused Bytes............................         663,552
Last Used Ext FileId....................              24
Last Used Ext BlockId...................      16,786,944
Last Used Block.........................              47
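
Note that show_space is not a built-in; it is the well-known demo procedure (popularized by Tom Kyte) wrapping the dbms_space package. If it is not installed, a minimal sketch using the documented dbms_space.unused_space call reports the same totals:

declare
    l_total_blocks  number;
    l_total_bytes   number;
    l_unused_blocks number;
    l_unused_bytes  number;
    l_file_id       number;
    l_block_id      number;
    l_last_block    number;
begin
    dbms_space.unused_space(
        segment_owner             => user,
        segment_name              => 'STAGE_QH',
        segment_type              => 'TABLE',
        total_blocks              => l_total_blocks,
        total_bytes               => l_total_bytes,
        unused_blocks             => l_unused_blocks,
        unused_bytes              => l_unused_bytes,
        last_used_extent_file_id  => l_file_id,
        last_used_extent_block_id => l_block_id,
        last_used_block           => l_last_block );
    dbms_output.put_line('Total blocks  = '||l_total_blocks);
    dbms_output.put_line('Total bytes   = '||l_total_bytes);
    dbms_output.put_line('Unused blocks = '||l_unused_blocks);
end;
/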

 

Now let's look at the data representation:

 

demo@PDB1> with function rowid_info ( p_rid in rowid )
  2  return varchar2 as
  3      l_data varchar2(80);
  4      l_rowid_type number;
  5      l_object_number number;
  6      l_relative_fno number;
  7      l_block_number number;
  8      l_row_number number;
  9  begin
 10      dbms_rowid.rowid_info(
 11          rowid_in => p_rid ,
 12          rowid_type => l_rowid_type,
 13          object_number => l_object_number,
 14          relative_fno => l_relative_fno,
 15          block_number => l_block_number,
 16          row_number => l_row_number );
 17      return (l_relative_fno||'.'||l_block_number||'.'||l_row_number);
 18  end;
 19  select id, rowid_info(rowid) x from stage_qh where rownum <= 10;
 20  /

 

        ID X
---------- ------------------------------
      2001 4.8563.0
      2002 4.8563.1
      2003 4.8563.2
      2004 4.8563.3
      2005 4.8563.4
      2006 4.8563.5
      2007 4.8563.6
      2008 4.8563.7
      2009 4.8563.8
      2010 4.8563.9

10 rows selected.

 

The values 2001 to 2010 are all present in the same block (i.e., block 8563).

 

demo@PDB1> update stage_qh set object_name = 'xx' where id = 2001;

1 row updated.

 

After updating one row we hold a lock on the entire CU, which means no other transaction can change any other row in the same CU. The autonomous transaction below tries exactly that; it waits on a lock that its parent session can never release, so we end up with an error like this:

 

demo@PDB1> declare
  2      pragma autonomous_transaction;
  3  begin
  4      update stage_qh set owner ='xx'
  5      where id = 2002;
  6      rollback;
  7  end;
  8  /
declare
*
ERROR at line 1:
ORA-00060: deadlock detected while waiting for resource
ORA-06512: at line 4

 

However, this changed in 12c and above: support for row-level locking was added to HCC. This sets aside some space in the CU header for tracking DML at the row level. To enable the feature we add a new clause to the HCC syntax:

 

demo@PDB1> create table stage_qh_rll
  2  compress for query high row level locking
  3  as
  4  select * from stage_qh;

Table created.

 

demo@PDB1> select table_name,compression,compress_for
  2  from user_tables
  3  where table_name like 'STAGE%';

TABLE_NAME      COMPRESS COMPRESS_FOR
--------------- -------- ------------------------------
STAGE_QH        ENABLED  QUERY HIGH
STAGE_QH_RLL    ENABLED  QUERY HIGH ROW LEVEL LOCKING
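
A CTAS is not the only way to get there; an existing HCC table can pick up the attribute by rebuilding the segment. A sketch, assuming ALTER TABLE ... MOVE accepts the same clause as CREATE TABLE, as the 12c DDL syntax suggests:

alter table stage_qh move
  compress for query high row level locking;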

 

demo@PDB1> exec show_space('STAGE_QH_RLL');
Unformatted Blocks .....................               0
FS1 Blocks (0-25) ......................               0
FS2 Blocks (25-50) .....................               0
FS3 Blocks (50-75) .....................               0
FS4 Blocks (75-100).....................               0
Full Blocks ............................           1,343
Total Blocks............................           1,408
Total Bytes.............................      11,534,336
Total MBytes............................              11
Unused Blocks...........................              35
Unused Bytes............................         286,720
Last Used Ext FileId....................              24
Last Used Ext BlockId...................      16,788,352
Last Used Block.........................              93

 

First of all, we can see that row-level locking has been requested and applied to the table. Comparing the sizes of the two tables shows the extra space added to the CU headers to keep track of DML: the table grew to 11MB, and the number of full blocks increased from 1,171 to 1,343 (roughly 15% overhead).

 

The data representation looks like this:

 

demo@PDB1> with function rowid_info ( p_rid in rowid )
  2  return varchar2 as
  3      l_data varchar2(80);
  4      l_rowid_type number;
  5      l_object_number number;
  6      l_relative_fno number;
  7      l_block_number number;
  8      l_row_number number;
  9  begin
 10      dbms_rowid.rowid_info(
 11          rowid_in => p_rid ,
 12          rowid_type => l_rowid_type,
 13          object_number => l_object_number,
 14          relative_fno => l_relative_fno,
 15          block_number => l_block_number,
 16          row_number => l_row_number );
 17      return (l_relative_fno||'.'||l_block_number||'.'||l_row_number);
 18  end;
 19  select id, rowid_info(rowid) x from stage_qh_rll where rownum <= 10;
 20  /

 

        ID X
---------- ------------------------------
      2001 4.8691.0
      2002 4.8691.1
      2003 4.8691.2
      2004 4.8691.3
      2005 4.8691.4
      2006 4.8691.5
      2007 4.8691.6
      2008 4.8691.7
      2009 4.8691.8
      2010 4.8691.9

10 rows selected.

 

All the rows from 2001 to 2010 are present in the same block (8691). In the next step, a transaction updates a single row in our new table:

 

demo@PDB1> update stage_qh_rll set object_name = 'xx' where id = 2001;

1 row updated.

 

With row-level locking enabled on this table, the update does not lock the entire CU, so another transaction updating a different row within the same CU no longer produces the deadlock we saw above.

 

demo@PDB1> declare
  2      pragma autonomous_transaction;
  3  begin
  4      update stage_qh_rll set owner ='xx'
  5      where id = 2002;
  6      rollback;
  7  end;
  8  /

PL/SQL procedure successfully completed.

 

 

No waits and no locks in this case.

Thursday, July 30, 2020

Hybrid Columnar Compression in 12.2

Oracle 12.2 introduced an interesting optimization for Hybrid Columnar Compression (HCC). Until 12.2 we had to use a direct path load into HCC-compressed segments for the data to actually be compressed. A conventional load would still succeed in putting data into the segment, but the newly inserted data was not HCC-compressed, and there was no error message or warning about that.
 
Here is a test case from an Oracle 11g database running on Exadata X7.
 
demo@ORA11G> select * from v$version;
 
BANNER
--------------------------------------------------------------------------------
Oracle Database 11g EE Extreme Perf Release 11.2.0.4.0 - 64bit Production
PL/SQL Release 11.2.0.4.0 - Production
CORE    11.2.0.4.0      Production
TNS for Linux: Version 11.2.0.4.0 - Production
NLSRTL Version 11.2.0.4.0 - Production
 
demo@ORA11G> create table t
  2  nologging
  3  compress for query high
  4  as
  5  select *
  6  from all_objects
  7  where 1 = 0 ;
 
Table created.
 
First we will start with a direct path load.
 
demo@ORA11G> insert /*+ append */ into t select * from all_objects where rownum <=10000;
 
10000 rows created.
 
demo@ORA11G> commit;
 
Commit complete.
 
So how do we verify that the data loaded by the above DML was compressed? Thankfully Oracle is well instrumented, and even more thankfully that instrumentation is not limited to performance. The Oracle-supplied package dbms_compression features a function, get_compression_type, which lets us pass a ROWID and tells us the type of compression applied to that row.
 
demo@ORA11G> select dbms_compression.get_compression_type(user,'T',rowid) comp_type, count(*)
  2  from t
  3  group by dbms_compression.get_compression_type(user,'T',rowid);
 
COMP_TYPE   COUNT(*)
---------- ----------
         4      10000
 
Compression type 4 means HCC query high: all rows loaded using direct path were compressed at the compression level set on the segment.
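
The numeric codes come from constants in the dbms_compression package spec. A sketch that decodes them into readable labels (names as per the 12c spec; in 11.2 some constants are spelled differently, e.g. comp_for_query_high):

select decode( dbms_compression.get_compression_type(user,'T',rowid),
               1, 'COMP_NOCOMPRESS',
               2, 'COMP_ADVANCED (OLTP)',
               4, 'COMP_QUERY_HIGH',
               8, 'COMP_QUERY_LOW',
              16, 'COMP_ARCHIVE_HIGH',
              32, 'COMP_ARCHIVE_LOW',
              64, 'COMP_BLOCK',
                  'UNKNOWN' ) comp_name,
       count(*)
from t
group by dbms_compression.get_compression_type(user,'T',rowid);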
 
Now let's repeat the test case without direct path (removing the append hint) and see what type of compression gets applied.
 
demo@ORA11G> truncate table t;
 
Table truncated.
 
demo@ORA11G> insert into t select * from all_objects where rownum <=10000;
 
10000 rows created.
 
demo@ORA11G> commit;
 
Commit complete.
 
demo@ORA11G> select dbms_compression.get_compression_type(user,'T',rowid) comp_type, count(*)
  2  from t
  3  group by dbms_compression.get_compression_type(user,'T',rowid);
 
COMP_TYPE   COUNT(*)
---------- ----------
         1       8994
        64       1006
 
Compression type 1 means no compression: during a conventional path load no compression is applied to the newly inserted data, so to have it effectively compressed every load must be done via direct path. (The rows reported as type 64 correspond to the package's block-compression code in later releases; either way, they are not HCC-compressed.)
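
If conventional-path rows have already crept into an HCC segment on these older releases, rebuilding the segment recompresses them to the target level. A sketch (note that a move leaves any indexes on the table UNUSABLE, so they would need rebuilding afterwards):

alter table t move compress for query high;
-- any index on t would now need: alter index <index_name> rebuild;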
 
Repeating the same in Oracle 12c (12.2), we see this.
 
c##rajesh@PDB1> insert into t select * from all_objects;
 
82161 rows created.
 
c##rajesh@PDB1> commit;
 
Commit complete.
 
c##rajesh@PDB1> select dbms_compression.get_compression_type(user,'T',rowid) comp_type, count(*)
  2  from t
  3  group by dbms_compression.get_compression_type(user,'T',rowid)
  4  /
 
COMP_TYPE   COUNT(*)
---------- ----------
         4      82161
 
Though we did a conventional path load, all rows were effectively compressed. So is there no more need for the append hint on insert-select in 12.2? That would be great news, and it should alleviate some of the issues people hit by unknowingly using HCC without direct path inserts. It is also good for concurrency, because direct path operations require a segment lock whereas array inserts do not. So in Oracle 12.2, array inserts into HCC segments can compress data even without the APPEND hint.
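
One caveat that still applies to direct path loads: the segment lock has a same-session side effect too. After a direct-path insert the segment cannot even be re-read within the same transaction; a sketch of the classic behaviour:

insert /*+ append */ into t select * from all_objects where rownum <= 100;
-- until we commit, this very session raises
-- ORA-12838: cannot read/modify an object after modifying it in parallel
select count(*) from t;
commit;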
 

Wednesday, July 22, 2020

Storage Indexes - Part VIII

Exadata storage indexes depend on smart scan, which in turn depends on direct path reads (either serial or parallel). Oracle will generally use serial direct path reads for large objects, but when an object is partitioned, Oracle may fail to recognize it as "large" while accessing the individual partitions, because the decision is made on the size of each individual segment. This can result in some partitions not being read via direct path reads, hence no smart scan, and thus no storage index benefit for those partitions.
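
Because the decision is made per segment, it is worth checking sizes at the partition level; a sketch against the documented user_segments view (the table name T is just illustrative here):

select segment_name, partition_name, blocks,
       round(bytes/1024/1024) mb
from user_segments
where segment_name = 'T'
order by partition_name nulls first;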
 
The same goes for compression: once the data is compressed, the reduced size of the segments makes them even less likely to trigger serial direct path reads, and the problem becomes even more noticeable.
 
Here is a table around 1,400MB in size:
 
c##rajesh@PDB1> create table t
  2  nologging
  3  as
  4  select *
  5  from big_table;
 
Table created.
 
c##rajesh@PDB1> exec show_space('T');
Unformatted Blocks .....................               0
FS1 Blocks (0-25) ......................               0
FS2 Blocks (25-50) .....................               0
FS3 Blocks (50-75) .....................               0
FS4 Blocks (75-100).....................               0
Full Blocks ............................         182,648
Total Blocks............................         188,416
Total Bytes.............................   1,543,503,872
Total MBytes............................           1,472
Unused Blocks...........................           5,100
Unused Bytes............................      41,779,200
Last Used Ext FileId....................              24
Last Used Ext BlockId...................      18,268,160
Last Used Block.........................           3,092
 
PL/SQL procedure successfully completed.
 
Here is the script that was used for execution.
 
c##rajesh@PDB1> $ type script.sql
set termout off
select * from t where owner ='JYU';
set termout on
 
c##rajesh@PDB1> select s.name,m.value
  2  from v$statname s ,
  3      v$mystat m
  4  where s.statistic# = m.statistic#
  5  and s.name in ('cell physical IO bytes saved by storage index',
  6  'cell physical IO interconnect bytes returned by smart scan' );
 
NAME                                                                    VALUE
------------------------------------------------------------ ----------------
cell physical IO bytes saved by storage index                               0
cell physical IO interconnect bytes returned by smart scan                  0
 
We execute the query repeatedly to warm up the storage cells so they build the storage index, then check whether it benefits the execution.
 
c##rajesh@PDB1> @script.sql
c##rajesh@PDB1> @script.sql
c##rajesh@PDB1> @script.sql
c##rajesh@PDB1> @script.sql
c##rajesh@PDB1> select s.name,m.value
  2  from v$statname s ,
  3      v$mystat m
  4  where s.statistic# = m.statistic#
  5  and s.name in ('cell physical IO bytes saved by storage index',
  6  'cell physical IO interconnect bytes returned by smart scan' );
 
NAME                                                                    VALUE
------------------------------------------------------------ ----------------
cell physical IO bytes saved by storage index                      3842637824
cell physical IO interconnect bytes returned by smart scan            5940000
 
The storage index is helping us here: it saved nearly 3.6GB of data from being transferred from the storage layer to the database layer, and the amount of data actually returned from storage to the database was as little as roughly 5.7MB.
 
Now let's see the effect with compression in place.
 
c##rajesh@PDB1> alter table t compress for archive high;
 
Table altered.
 
c##rajesh@PDB1> alter table t move online parallel 8;
 
Table altered.
 
c##rajesh@PDB1> exec show_space('T');
Unformatted Blocks .....................               0
FS1 Blocks (0-25) ......................               0
FS2 Blocks (25-50) .....................               0
FS3 Blocks (50-75) .....................               0
FS4 Blocks (75-100).....................               4
Full Blocks ............................           9,621
Total Blocks............................           9,752
Total Bytes.............................      79,888,384
Total MBytes............................              76
Unused Blocks...........................               0
Unused Bytes............................               0
Last Used Ext FileId....................              24
Last Used Ext BlockId...................      23,909,120
Last Used Block.........................             536
 
PL/SQL procedure successfully completed.
 
c##rajesh@PDB1> exec dbms_stats.gather_table_stats(user,'T',degree=>4,no_invalidate=>false);
 
PL/SQL procedure successfully completed.
 
Compression has reduced the size from over 1,400MB to just 76MB, roughly a 19x reduction. Let's run the queries again post-compression.
 
c##rajesh@PDB1> select s.name,m.value
  2  from v$statname s ,
  3      v$mystat m
  4  where s.statistic# = m.statistic#
  5  and s.name in ('cell physical IO bytes saved by storage index',
  6  'cell physical IO interconnect bytes returned by smart scan' );
 
NAME                                                                    VALUE
------------------------------------------------------------ ----------------
cell physical IO bytes saved by storage index                               0
cell physical IO interconnect bytes returned by smart scan                  0
 
c##rajesh@PDB1> @script.sql
c##rajesh@PDB1> @script.sql
c##rajesh@PDB1> @script.sql
c##rajesh@PDB1> @script.sql
c##rajesh@PDB1> @script.sql
c##rajesh@PDB1> select s.name,m.value
  2  from v$statname s ,
  3      v$mystat m
  4  where s.statistic# = m.statistic#
  5  and s.name in ('cell physical IO bytes saved by storage index',
  6  'cell physical IO interconnect bytes returned by smart scan' );
 
NAME                                                                    VALUE
------------------------------------------------------------ ----------------
cell physical IO bytes saved by storage index                               0
cell physical IO interconnect bytes returned by smart scan            1894136
 
c##rajesh@PDB1>
 
No matter how often we run these queries now, no storage index is used. This confirms that storage indexes are not used for smaller segments. Of course, for small segments that is reasonable; it is on the larger segments that favor direct path reads that storage indexes play a major role, eliminating the regions where the requested data cannot exist.
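
If a small segment really must be smart-scanned, serial direct path reads can be forced at the session level; a sketch using the hidden parameter _serial_direct_read (undocumented, so for test systems only), with the 'table scans (direct read)' statistic confirming whether direct reads actually happened:

alter session set "_serial_direct_read" = always;  -- values: auto (default), always, never

-- re-run the query, then check:
select s.name, m.value
from v$statname s, v$mystat m
where s.statistic# = m.statistic#
and s.name in ('table scans (direct read)',
               'cell physical IO bytes saved by storage index');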