THE RAJESH BLOG: GUID or Sequence

Yesterday got a question on using GUID as primary keys in Oracle database than sequences.

The main difference lies in the fact about how the data is organized in the table and indexes.

GUID’s are

Randomly unique values
Will insert randomly across the entire breadth of the index.
will need the entire index in memory in order to say “we do no physical reads on the index during an insert”
These values are NOT monotonically increasing values
They have a terrific impact on the clustering factor of the index (close to the number of rows - so range scans are not likely possible).
Index will be larger in size when compare to sequence based index, will end up with 50-50 index splits ( half of the index will be empty at any time – that’s ok – don’t rebuild them to reclaim the space, that will only be a waste of time)

Sequences are

Sequentially unique values
As you insert – the values are increasing, you only hit the right hand size of the index.
Only the right hand side of the index in the cache, the left hand side – which you don’t query so much anymore since it is old data – it is rarely in the cache, hence when we insert we into this index – you will do very little if any physical IO.
These values are monotonically increasing values ( as you insert into it, the values are increasing and hit the right hand side of index)
They will end up with an appealing clustering factor of the index (close to the number of blocks – so range scans are likely possible)
Index will be thin, will end up with 90-10 split ups.

Here is my demo to show the impact of using GUID as primary keys (modeled using a classical OLTP case – where each transaction is supposed to insert only a couple/few rows ).

demo@ORA12C> create table t1(x raw(16) constraint t1_pk primary key, data varchar2(100));

Table created.

demo@ORA12C> create table t2(x number constraint t2_pk primary key, data varchar2(100));

Table created.

demo@ORA12C>

demo@ORA12C> create sequence s1 cache 1000;

Sequence created.

demo@ORA12C> variable n number

demo@ORA12C> exec :n := 100000;

PL/SQL procedure successfully completed.

demo@ORA12C> exec runstats_pkg.rs_start;

PL/SQL procedure successfully completed.

demo@ORA12C> begin

2 for i in 1..:n

3 loop

4 insert into t1(x,data)

5 values(sys_guid(),'x');

6 commit;

7 end loop;

8 end;

9 /

PL/SQL procedure successfully completed.

demo@ORA12C> exec runstats_pkg.rs_middle;

PL/SQL procedure successfully completed.

demo@ORA12C> begin

2 for i in 1..:n

3 loop

4 insert into t2(x,data)

5 values(s1.nextval,'x');

6 commit;

7 end loop;

8 end;

9 /

PL/SQL procedure successfully completed.

demo@ORA12C> exec runstats_pkg.rs_stop(1000);

Run1 ran in 2373 hsecs and CPU time in 1320 hsecs

Run2 ran in 1808 hsecs and CPU time in 889 hsecs

run 1 ran in 131.25% of the time

Name Run1 Run2 Diff

STAT...commit cleanouts 201,485 200,479 -1,006

STAT...commit cleanouts succes 201,481 200,475 -1,006

STAT...immediate (CURRENT) blo 1,489 374 -1,115

STAT...deferred (CURRENT) bloc 148,796 150,004 1,208

STAT...physical reads for flas 4,890 3,673 -1,217

STAT...physical reads cache 4,895 3,678 -1,217

STAT...physical reads 4,895 3,678 -1,217

STAT...physical read total IO 4,895 3,678 -1,217

STAT...physical read IO reques 4,895 3,678 -1,217

STAT...free buffer requested 4,925 3,707 -1,218

STAT...non-idle wait count 4,921 3,698 -1,223

STAT...db block gets from cach 102,656 101,391 -1,265

LATCH.row cache objects 2,245 3,594 1,349

LATCH.undo global data 306,373 304,738 -1,635

LATCH.checkpoint queue latch 7,490 9,323 1,833

LATCH.object queue header oper 20,925 18,964 -1,961

STAT...redo entries 308,755 303,767 -4,988

STAT...db block gets from cach 520,397 512,708 -7,689

STAT...db block gets 520,397 512,708 -7,689

STAT...db block changes 613,707 605,896 -7,811

STAT...session logical reads 523,017 514,618 -8,399

LATCH.Consistent RBA 32,396 21,652 -10,744

STAT...messages sent 32,844 21,909 -10,935

LATCH.lgwr LWN SCN 32,935 21,956 -10,979

LATCH.messages 72,350 46,897 -25,453

LATCH.cache buffers chains 2,123,816 2,092,403 -31,413

LATCH.redo writing 97,361 65,210 -32,151

LATCH.redo allocation 98,424 65,823 -32,601

STAT...session pga memory 65,536 0 -65,536

STAT...session pga memory max 65,536 0 -65,536

STAT...session uga memory max 157,672 65,488 -92,184

STAT...calls to get snapshot s 100,564 200,452 99,888

STAT...file io wait time 9,408,791 9,213,081 -195,710

LATCH.sequence cache 0 300,206 300,206

STAT...KTFB alloc space (block 23,986,176 21,889,024 -2,097,152

STAT...undo change vector size 27,851,092 24,010,264 -3,840,828

STAT...redo size 97,029,204 87,294,744 -9,734,460

STAT...physical read bytes 40,099,840 30,130,176 -9,969,664

STAT...physical read total byt 40,099,840 30,130,176 -9,969,664

STAT...cell physical IO interc 40,099,840 30,130,176 -9,969,664

STAT...logical read bytes from######################## -68,804,608

Run1 latches total versus runs -- difference and pct

Run1 Run2 Diff Pct

3,748,534 3,905,655 157,121 95.98%

PL/SQL procedure successfully completed.

demo@ORA12C>

With sequence based primary keys, we could see 24% improvements on elapsed timings and 44% improvements on CPU timings.

The most important thing to look over with the statistics is the physical IO that got reduced to 1/3 with sequence based primary keys (highlighted in red above) in place.

demo@ORA12C> column index_name format a10

demo@ORA12C> select index_name, clustering_factor,t.num_rows,t.blocks

2 from user_indexes i,

3 user_tables t

4 where i.table_name in ('T1','T2')

5 and i.table_name = t.table_name

6 order by index_name ;

INDEX_NAME CLUSTERING_FACTOR NUM_ROWS BLOCKS

---------- ----------------- ---------- ----------

T1_PK 99741 100000 370

T2_PK 164 100000 244

2 rows selected.

demo@ORA12C>

The index T1_PK has the clustering factor close the number of rows – means the data is randomly scattered across the table, with organized to index, so the changes of using the index to answer the queries is likely less.

demo@ORA12C> column min_x new_val min_x

demo@ORA12C> column max_x new_val max_x

demo@ORA12C> select min(x) min_x,max(x) max_x

2 from ( select x from t1

3 order by x

4 fetch first 100 row only );

MIN_X MAX_X

-------------------------------- --------------------------------

00057F09AAAD444DAAB3C086EC8AC754 0038DFA03B12479CB8A8638E7F96FFC2

1 row selected.

demo@ORA12C> set autotrace traceonly explain

demo@ORA12C> select * from t1 where x between

2 hextoraw('&min_x') and hextoraw('&max_x');

old 2: hextoraw('&min_x') and hextoraw('&max_x')

new 2: hextoraw('00057F09AAAD444DAAB3C086EC8AC754') and hextoraw('0038DFA03B12479CB8A8638E7F96FFC2')

Execution Plan

----------------------------------------------------------

Plan hash value: 3617692013

--------------------------------------------------------------------------

--------------------------------------------------------------------------

| 0 | SELECT STATEMENT | | 80 | 1520 | 79 (4)| 00:00:01 |

|* 1 | TABLE ACCESS FULL| T1 | 80 | 1520 | 79 (4)| 00:00:01 |

--------------------------------------------------------------------------

Predicate Information (identified by operation id):

---------------------------------------------------

1 - filter("X"<=HEXTORAW('0038DFA03B12479CB8A8638E7F96FFC2') AND

"X">=HEXTORAW('00057F09AAAD444DAAB3C086EC8AC754'))

The index T2_PK has the clustering factor close the number of blocks – means the data in the table is organized to index, so the changes of using the index to answer the queries is more possible.

demo@ORA12C> select min(x) min_x,max(x) max_x

2 from ( select x from t2

3 order by x

4 fetch first 100 row only );

MIN_X MAX_X

---------- ----------

200001 200100

1 row selected.

demo@ORA12C> set autotrace traceonly explain

demo@ORA12C> select * from t2 where x between &min_x and &max_x;

old 1: select * from t2 where x between &min_x and &max_x

new 1: select * from t2 where x between 200001 and 200100

Execution Plan

----------------------------------------------------------

Plan hash value: 1879001825

----------------------------------------------------------------------------

----------------------------------------------------------------------------

| 0 | SELECT STATEMENT | | 1 | 7 | 3|

| 1 | TABLE ACCESS BY INDEX ROWID BATCHED| T2 | 1 | 7 | 3|

|* 2 | INDEX RANGE SCAN | T2_PK | 1 | | 2|

----------------------------------------------------------------------------

Predicate Information (identified by operation id):

---------------------------------------------------

2 - access("X">=200001 AND "X"<=200100))

Now comparing the size of the index, GUID based indexes took 2x space when compare to Sequence based indexes, (remember bigger the size means, more disk space, more IO and more latches and less scalable)

demo@ORA12C> analyze index t1_pk validate structure;

Index analyzed.

demo@ORA12C> select name, blocks,lf_blks,btree_space,used_space,pct_used from index_stats;

NAME BLOCKS LF_BLKS BTREE_SPACE USED_SPACE PCT_USED

---------- ---------- ---------- ----------- ---------- ----------

T1_PK 640 504 4040032 2704473 67

1 row selected.

demo@ORA12C> exec show_space('T1_PK',user,'INDEX');

Unformatted Blocks ..................... 110

FS1 Blocks (0-25) ..................... 0

FS2 Blocks (25-50) ..................... 15

FS3 Blocks (50-75) ..................... 0

FS4 Blocks (75-100)..................... 0

Full Blocks ..................... 497

Total Blocks............................ 640

Total Bytes............................. 5,242,880

Total MBytes............................ 5

Unused Blocks........................... 0

Unused Bytes............................ 0

Last Used Ext FileId.................... 8

Last Used Ext BlockId................... 22,656

Last Used Block......................... 128

PL/SQL procedure successfully completed.

Totally there were about 640 blocks almost 33% free spaces available in all blocks, since the GUID based inserts were spread across the breath (due to its random nature).

demo@ORA12C> analyze index t2_pk validate structure;

Index analyzed.

demo@ORA12C> select name, blocks,lf_blks,btree_space,used_space,pct_used from index_stats;

NAME BLOCKS LF_BLKS BTREE_SPACE USED_SPACE PCT_USED

---------- ---------- ---------- ----------- ---------- ----------

T2_PK 256 187 1504032 1490919 100

1 row selected.

demo@ORA12C> exec show_space('T2_PK',user,'INDEX');

Unformatted Blocks ..................... 32

FS1 Blocks (0-25) ..................... 0

FS2 Blocks (25-50) ..................... 25

FS3 Blocks (50-75) ..................... 0

FS4 Blocks (75-100)..................... 0

Full Blocks ..................... 187

Total Blocks............................ 256

Total Bytes............................. 2,097,152

Total MBytes............................ 2

Unused Blocks........................... 0

Unused Bytes............................ 0

Last Used Ext FileId.................... 8

Last Used Ext BlockId................... 23,040

Last Used Block......................... 128

PL/SQL procedure successfully completed.

Almost 50% reduction on the disk space, with sequence based indexes in place, and each block is well utilized (almost 100% utilized – no free space in them) before moving into the next leaf block.

All these factor into less logical IO when using sequence based primary keys in place.

demo@ORA12C> set autotrace traceonly explain statistics

demo@ORA12C> select x from t1;

100000 rows selected.

Execution Plan

----------------------------------------------------------

Plan hash value: 3617692013

--------------------------------------------------------------------------

--------------------------------------------------------------------------

| 0 | SELECT STATEMENT | | 100K| 1660K| 78 (3)| 00:00:01 |

| 1 | TABLE ACCESS FULL| T1 | 100K| 1660K| 78 (3)| 00:00:01 |

--------------------------------------------------------------------------

Statistics

----------------------------------------------------------

0 recursive calls

0 db block gets

1041 consistent gets

0 physical reads

0 redo size

1930381 bytes sent via SQL*Net to client

7878 bytes received via SQL*Net from client

668 SQL*Net roundtrips to/from client

0 sorts (memory)

0 sorts (disk)

100000 rows processed

So why do we get a full Table scan rather than index fast full scan (since we access only a single column that is the primary key of the table)? , when we compare the size of the table and index, table look smaller than the index (since it holds the key value + row id in place for each row in the leaf block).

demo@ORA12C> exec show_space('T1_PK',user,'INDEX');

Unformatted Blocks ..................... 110

FS1 Blocks (0-25) ..................... 0

FS2 Blocks (25-50) ..................... 15

FS3 Blocks (50-75) ..................... 0

FS4 Blocks (75-100)..................... 0

Full Blocks ..................... 497

Total Blocks............................ 640

Total Bytes............................. 5,242,880

Total MBytes............................ 5

Unused Blocks........................... 0

Unused Bytes............................ 0

Last Used Ext FileId.................... 8

Last Used Ext BlockId................... 22,656

Last Used Block......................... 128

PL/SQL procedure successfully completed.

demo@ORA12C> exec show_space('T1',user,'TABLE');

Unformatted Blocks ..................... 0

FS1 Blocks (0-25) ..................... 0

FS2 Blocks (25-50) ..................... 0

FS3 Blocks (50-75) ..................... 0

FS4 Blocks (75-100)..................... 39

Full Blocks ..................... 331

Total Blocks............................ 384

Total Bytes............................. 3,145,728

Total MBytes............................ 3

Unused Blocks........................... 0

Unused Bytes............................ 0

Last Used Ext FileId.................... 8

Last Used Ext BlockId................... 21,504

Last Used Block......................... 128

PL/SQL procedure successfully completed.

demo@ORA12C> select x from t2;

100000 rows selected.

Execution Plan

----------------------------------------------------------

Plan hash value: 871972594

------------------------------------------------------------------------------

------------------------------------------------------------------------------

| 0 | SELECT STATEMENT | | 100K| 488K| 41 (5)| 00:00:01 |

| 1 | INDEX FAST FULL SCAN| T2_PK | 100K| 488K| 41 (5)| 00:00:01 |

------------------------------------------------------------------------------

Statistics

----------------------------------------------------------

0 recursive calls

0 db block gets

892 consistent gets

0 physical reads

0 redo size

729371 bytes sent via SQL*Net to client

7878 bytes received via SQL*Net from client

668 SQL*Net roundtrips to/from client

0 sorts (memory)

0 sorts (disk)

100000 rows processed

demo@ORA12C> set autotrace off

By all means, sequence based primary keys looks very appealing (less cpu, less IO, less diskspace, 100% PCT on leaf blocks, good clustering factor) and highly scalable than GUID based primary keys.

Friday, December 2, 2016

GUID or Sequence

1 comment: