As of Oracle Database 11g, serial full table scans can use direct reads (see Metalink note 793845.1). This is basically done to improve the performance of large full table scans. In my opinion, the most important pros and cons of this feature are the following…
Pros (which are related to the fact that the buffer cache is not used):
- Reduced latch utilization
- The size of the physical I/Os does not depend on the blocks that are in the buffer cache
Cons:
- A segment checkpoint must precede the first direct read
- Repeated delayed block cleanouts
In this post, as the title suggests, I would like to focus on the last one. The reason is simple: I cannot remember having read anything about that topic before.
Let’s start by showing you an example where delayed block cleanouts are performed…
- Create a test table and insert some data into it (note that the table uses 10000 database blocks):
SQL> CREATE TABLE t (id NUMBER, pad VARCHAR2(500)) TABLESPACE test_8k;

SQL> INSERT INTO t
  2  SELECT rownum, rpad('*',500,'*')
  3  FROM dual
  4  CONNECT BY level <= 140000;

SQL> COMMIT;
- Generate some blocks that require delayed block cleanout:
SQL> UPDATE t SET id=id*10;

SQL> COMMIT;
- Flush the buffer cache (this is necessary because many blocks are in the buffer cache and, therefore, direct reads would not be used for reading the test table):
SQL> ALTER SYSTEM FLUSH BUFFER_CACHE;
- Execute two queries on the test table and store the session statistics in a helper table:
SQL> CREATE TABLE s AS
  2  SELECT 0 AS run, name, value
  3  FROM v$statname NATURAL JOIN v$mystat;

SQL> SELECT count(*) FROM t;

  COUNT(*)
----------
    140000

SQL> INSERT INTO s
  2  SELECT 1, name, value
  3  FROM v$statname NATURAL JOIN v$mystat;

SQL> SELECT count(*) FROM t;

  COUNT(*)
----------
    140000

SQL> INSERT INTO s
  2  SELECT 2, name, value
  3  FROM v$statname NATURAL JOIN v$mystat;
- Show some statistics for the two executions:
SQL> SELECT *
  2  FROM (
  3    SELECT run, name, value-lag(value) OVER (ORDER BY name, run) AS value
  4    FROM s
  5    WHERE name IN ('cleanouts only - consistent read gets',
  6                   'session logical reads',
  7                   'physical reads',
  8                   'physical reads direct')
  9  )
 10  PIVOT (sum(value) FOR run IN (1 AS RUN_1, 2 AS RUN_2))
 11  ORDER BY name;

NAME                                          RUN_1      RUN_2
---------------------------------------- ---------- ----------
cleanouts only - consistent read gets          9466       9399
physical reads                                10112      10004
physical reads direct                         10003      10003
session logical reads                         19850      19538
As you can see from the execution statistics, both runs did almost the same number of (direct) physical reads, logical reads and cleanouts. The last point is especially strange. Usually, a cleanout is performed only once for each block. In this case, however, the same cleanouts are repeated for every execution of the query.
This is clearly due to the direct reads.
To confirm that, let’s rerun the test without using serial direct reads. For that purpose, event 10949 is enabled.
SQL> TRUNCATE TABLE s;

SQL> ALTER SESSION SET EVENTS '10949 TRACE NAME CONTEXT FOREVER';

SQL> INSERT INTO s
  2  SELECT 0 AS run, name, value
  3  FROM v$statname NATURAL JOIN v$mystat;

SQL> SELECT count(*) FROM t;

  COUNT(*)
----------
    140000

SQL> INSERT INTO s
  2  SELECT 1, name, value
  3  FROM v$statname NATURAL JOIN v$mystat;

SQL> SELECT count(*) FROM t;

  COUNT(*)
----------
    140000

SQL> INSERT INTO s
  2  SELECT 2, name, value
  3  FROM v$statname NATURAL JOIN v$mystat;

SQL> SELECT *
  2  FROM (
  3    SELECT run, name, value-lag(value) OVER (ORDER BY name, run) AS value
  4    FROM s
  5    WHERE name IN ('cleanouts only - consistent read gets',
  6                   'session logical reads',
  7                   'physical reads',
  8                   'physical reads direct')
  9  )
 10  PIVOT (sum(value) FOR run IN (1 AS RUN_3, 2 AS RUN_4))
 11  ORDER BY name;

NAME                                          RUN_3      RUN_4
---------------------------------------- ---------- ----------
cleanouts only - consistent read gets          9399          0
physical reads                                 9919          0
physical reads direct                             0          0
session logical reads                         19592      10228
As you can see from the statistics, the major difference between the second and the third run is the number of direct reads. The interesting thing to notice is that the third run finally performed a “regular” cleanout and, therefore, the subsequent run, the fourth one, was able to read the (cleaned) blocks without doing any cleanout. Because of that, the number of logical reads of the fourth run was much lower.
Thanks for the great catch and the example code. I suspected something like this is happening, but was very confused since “everyone knows each block is only cleaned once”.
It’s interesting to note the difference in the metrics, but how would you describe the impact on a database system because of this new behavior? I’m trying to draw the line between difference in metrics and the difference in performance on a database system. What does it mean in terms of performance impact?
Chris. This is a very natural but hard-to-find restriction.
Here we have another reason to read the whole table with a non-direct scan after a big DML operation!
Let me check my understanding: direct read means that the block is stored in RAM inside the process’ private memory (not in the buffer cache), and hence any modifications to the block are made there and not persisted to disk. Correct?
I would guess that writing back to disk would be a bit complex; it would necessitate a synchronization with DBWR, checking that you are not writing back an older version of the block, etcetera …
Hi Greg, Dion
First of all, it is true that such a behavior will probably cause performance problems only in specific cases. Nevertheless, in my opinion it is always good to know that such a possibility exists. Because of this problem, I would, for example, execute a “silly” query against a table that I know I just modified with a massive DML operation. You know, just in case ;-) This is especially true for data that will be “archived” or put in a read-only tablespace.
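For example, based on the test case above, something as simple as the following takes care of the cleanouts once and for all (just a sketch; as shown earlier in the post, event 10949 disables serial direct reads):

-- force a conventional full scan so that the cleanouts are done (and written back) once
ALTER SESSION SET EVENTS '10949 trace name context forever';

SELECT count(*) FROM t;

ALTER SESSION SET EVENTS '10949 trace name context off';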
Now, about the actual performance impact… Of course, it depends! In this specific case, let me show you a couple of figures (to produce them, I used the same queries as in the post, but with a table storing about 5GB of data).
If the query is I/O bound (no data cached in the Linux buffer cache), I have measured an elapsed time of 51 seconds (8.3 seconds on CPU) with cleanouts and 46 seconds (6.1 seconds on CPU) without them. So, the execution with cleanouts is about 10% slower and uses about 30% more CPU.
If the query is CPU bound (data is cached in the Linux buffer cache), I have measured an elapsed time of 6.1 seconds (100% of the time on CPU) with cleanouts and 4.0 seconds (100% of the time on CPU) without them. So, the execution with cleanouts is about 50% slower and uses about 50% more CPU.
Cheers,
Chris
Ciao Alberto
> any modifications to the block are made there and not persisted to disk. Correct?
Yes. And, therefore, they have to be repeated every time the block is read.
Sa vedum,
Chris
A delayed block cleanout does require updating the block (and, at some point, writing it back to disk). Both operations would go through the SGA (the actual write, I expect, still has to be done by DBWR). A direct read bypassing the SGA cannot be expected (or allowed!) to update the block.
Another session doing a non-direct read must “take the burden” of doing the block cleanout.
Chris.
In your testcase, how do you define the exact meaning of “cleanouts only – consistent read gets”?
It seems to me that the cleanout done by a serial direct path read is not a real cleanout, but just a read of the undo information to confirm whether the block is dirty or not.
My test case shows a big difference in redo size between the two cases (direct read vs. conventional read after a big update).
-- serial direct read
redo size                4,156

-- conventional read
redo size              149,184
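(For reference, deltas like these can be collected by taking the 'redo size' statistic before and after the query; this is only a sketch, not the exact test case:)

-- snapshot 'redo size' before and after the full scan; the difference is the
-- redo generated by the query (same views as used in the post)
SELECT value FROM v$statname NATURAL JOIN v$mystat WHERE name = 'redo size';

SELECT count(*) FROM t;

SELECT value FROM v$statname NATURAL JOIN v$mystat WHERE name = 'redo size';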
If serial direct path read really cleans out the dirty blocks, wouldn’t it show similar redo size?
Hemant
To avoid misunderstandings, let me stress that I do not expect that a direct read updates the block on disk. Of course it cannot. With this post I just wanted to point out that behavior.
Cheers,
Chris
Dion
> how do you define the exact meaning of “cleanouts only – consistent read gets”?
I’m not sure I understand your question. Did you ask it because the statistic contains the word “only”? If so, please notice that there is another statistic named “cleanouts and rollbacks – consistent read gets”. Their definitions, according to the Reference Guide, are the following:
cleanouts only – consistent read gets: Number of consistent gets that require only block cleanouts, no rollbacks.
cleanouts and rollbacks – consistent read gets: Number of consistent gets that require both block rollbacks and block cleanouts.
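By the way, both statistics can be watched for the current session with the same views used in the post, for example:

SELECT name, value
  FROM v$statname NATURAL JOIN v$mystat
 WHERE name IN ('cleanouts only - consistent read gets',
                'cleanouts and rollbacks - consistent read gets');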
> My test case shows a big difference in redo size between the two cases (direct read vs. conventional read after a big update).
I noticed the same behavior. This is probably due to the fact that the block is not really modified. Honestly, I didn’t investigate further in this direction. But it could be interesting to extract the information stored in the redo logs…
Cheers,
Chris
The reference guide! What a shame I didn’t check it. :(
The redo generated by the serial direct path read seems to be the minimum workload necessary to activate the direct path read – maybe related to the object checkpoint. As you commented, the redo log will show the exact info.
I think that the serial direct path read does not clean out dirty blocks multiple times (as per your expression, the blocks wouldn’t really be modified multiple times). It would just read additional blocks to check the status of the blocks.
Dion
> maybe related to the object checkpoint
Yep. That should be the reason.
> I think that the serial direct path read does not clean out dirty blocks multiple times
The block on disk must be cleaned out. So, there is much more going on than just checking the transaction table in the undo segment. On that point I did several crosschecks between execution statistics, block dumps and event 10203… everything is consistent and leads to the conclusion that a cleanout of every block is performed. The only exception is the blocks that were fast cleaned out at commit time.
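For example, a single block of T can be dumped and inspected before and after a run with something like the following (just a sketch; the file and block numbers returned by the first query have to be plugged into the second statement):

-- find the file number and block number of one block of T
SELECT dbms_rowid.rowid_relative_fno(rowid) AS file_no,
       dbms_rowid.rowid_block_number(rowid) AS block_no
  FROM t
 WHERE rownum = 1;

-- dump that block to the session trace file (privileged operation)
ALTER SYSTEM DUMP DATAFILE &file_no BLOCK &block_no;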
Cheers,
Chris
I suppose if you select ora_rowscn’s in your test runs 1,2,3 they will be different in each run.
In test run 4 they should be the same as in run 3.
Am I right?
Hi Todor
For all runs the ora_rowscn is the same. Makes sense…
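Something like the following can be used to check it after each run (just a sketch):

-- if all rows show the same value, ORA_ROWSCN did not change
SELECT min(ora_rowscn) AS min_scn,
       max(ora_rowscn) AS max_scn,
       count(DISTINCT ora_rowscn) AS distinct_scn
  FROM t;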
Cheers,
Chris
Hi Christian, I read your interesting test and, in this connection, I would like to bring to your attention some doubts about the management of direct reads.
I did two tests, the first on DB version 10g, the other on 11g, to try to understand whether the principle is the same for the two database versions. I know that one condition for Oracle to decide to perform direct reads is that the number of blocks should be about 5 times the value of the hidden parameter _small_table_threshold.
Test on 10g:
With a number of blocks greater than 10650, direct reads should be used, but this doesn’t happen. For completeness, I also checked the number of blocks of the table.
I ran the same test on 11g:
So, I have the same situation that I had in 10g, but with a different value of _small_table_threshold.
I noticed that in 11g, under the same conditions as the test conducted in 10g, i.e. with _serial_direct_read = FALSE and a number of blocks (50437) that exceeds 5 times the value of _small_table_threshold (7120), the direct reads are used, which did not happen in 10g.
But then does this mean that in Oracle 11g, even with _serial_direct_read = FALSE, as long as the number of blocks is greater than 5 times the value of _small_table_threshold (and the other conditions I referred to are met), the direct reads are performed?
Is my interpretation correct?
Thanks a lot
Donatello
Ciao Donatello
Yes, your interpretation is correct. Just note that the initialization parameter _serial_direct_read has existed for a long time (8i?) and, probably for that reason, it is not (directly) related to the serial direct reads that are used in recent releases.
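By the way, the current values of the two hidden parameters can be checked with a query like the following (as SYS; just a sketch):

-- hidden parameters are exposed through X$KSPPI (names) and X$KSPPCV (values)
SELECT ksppinm AS name, ksppstvl AS value
  FROM x$ksppi i, x$ksppcv v
 WHERE i.indx = v.indx
   AND ksppinm IN ('_serial_direct_read', '_small_table_threshold');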
Cheers,
Chris
Ciao Christian,
I would like to ask you one last question, this time regarding direct reads and delayed block cleanouts in 10g.
I checked whether there are dirty blocks and, as one might legitimately expect:
This means that there are dirty blocks in the buffer cache; the blocks touched by the insert are dirty by definition. I ran some statistics checks before executing my query:
I ran my query:
Then I reran the statistics:
My interpretation is the following:
When I run my query (the select count(*) from t;) there are dirty blocks in the buffer cache. I see from the statistics that Oracle has answered my query using direct reads. To do that, Oracle must write those blocks from the SGA to disk and then read them into the PGA, since the engine has established that the circumstances require the use of direct reads.
1) Is what I say correct?
2) If my hypothesis is correct, does it mean that during the query (select count(*) from t) a delayed block cleanout was performed for a number of blocks (the ones on disk), plus a fast commit cleanout for a number of blocks (those that were in the SGA), before the engine executed the direct reads?
Kind Regards,
Donatello
Ciao Donatello
I’m not sure I fully understand what you mean… This is probably because you are confused by the dirty blocks in the SGA. The important thing you have to remember is that before doing direct reads a segment checkpoint has to be performed. Because of the segment checkpoint, all blocks are on disk and, therefore, every time they are read through a direct read a deferred cleanout has to be performed.
HTH
Chris
Ciao Christian,
I explained myself badly. I understood from what you had written that a key prerequisite is that a checkpoint has been executed, but what I meant is that after my insert (i.e. before the select count(*)…) I saw a number of dirty blocks in the SGA, and this means that the image of those blocks does not match the one on disk, also because some of those blocks have never been on disk, right?
I’d just like to have the following concept confirmed:
when I run “select count(*) from t” and I see that Oracle uses
“direct reads” …
… does this mean that Oracle has in the meantime done
a checkpoint and has therefore written to disk even the blocks that
were in the SGA? If not, I would not expect “direct reads”
but “conventional reads”.
Thanks
Ciao Donatello
> what I meant is that after my insert (i.e. before the
> select count(*)…) I saw a number of dirty blocks in the SGA
> the image of those blocks does not match the one on disk
> also because some of those blocks have never been on disk, right?
Yes, a regular insert writes into the buffer cache only.
> when I run “select count(*) from t” and I see that Oracle uses
> “direct reads” …
>
> … does this mean that Oracle has in the meantime done
> a checkpoint and has therefore written to disk even the blocks that
> were in the SGA?
Yes.
And now I understand what bothers you :-) It is indeed the case that DBWR has to write some blocks to disk and, just after that, the server process has to read them back.
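Just as an illustration (a sketch, not part of Donatello's original checks), the dirty buffers of T can be counted like this before running the query (access to V$BH is required):

-- count the dirty buffers belonging to T that are still in the buffer cache
SELECT count(*) AS dirty_buffers
  FROM v$bh
 WHERE objd = (SELECT data_object_id
                 FROM user_objects
                WHERE object_name = 'T')
   AND dirty = 'Y';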
HTH
Chris
Ciao Chris,
I tried to think about the impact on real applications, so I extracted a test case:
it seems that the "physical reads direct" on table t are repeated 100 times (100 * 10003)
because of the 100 nested-loop passes over t.
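(Ksun's test case is not reproduced here; a hypothetical query along the following lines would show the same pattern, i.e. one serial full scan of T, possibly done with direct reads, for each of the 100 driving rows:)

-- purely illustrative: a nested loops join that full scans T once per driving row
SELECT /*+ leading(d) use_nl(t) full(t) */ count(t.pad)
  FROM (SELECT rownum AS n FROM dual CONNECT BY level <= 100) d, t
 WHERE t.id >= d.n;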
Hi Ksun
I would not expect anything different in such a case…
Cheers,
Chris
I have found that “direct path read” waits have a several times greater response time than “db file scattered read” waits. Do you think that the time spent on these overheads mentioned above (repeated block cleanouts, segment checkpoint) is also counted in the response time of “direct path read”? If not, then what can be the reason that “direct path read” waits are slower? The MBRC should be the same for both.
Jan
Hi Jan
> Do you think that the time spent on these overheads mentioned above is also counted in the response time of “direct path read”?
No. You should see more disk I/O and CPU utilization for them.
> If not, then what can be the reason that “direct path read” waits are slower? The MBRC should be the same for both.
They are not implemented in the same way. E.g., the system calls (depending on port and release) might be different. In addition, the size of the disk I/Os can also be different. E.g., a FTS based on scattered reads does not read blocks that are already in the buffer cache. So, I would suggest carefully comparing the two kinds of disk I/O…
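As a starting point for such a comparison, the average wait times of the two events in the current session can be checked with something like this (just a sketch):

-- compare the average wait time of the two disk I/O wait events for this session
SELECT event, total_waits,
       round(time_waited_micro / total_waits) AS avg_wait_us
  FROM v$session_event
 WHERE sid = sys_context('userenv', 'sid')
   AND event IN ('direct path read', 'db file scattered read');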
HTH
Chris