Jul 23 2009

Impact of Direct Reads on Delayed Block Cleanouts

Tag: 11gR1Christian Antognini @ 3:58 pm

As of Oracle Database 11g, serial full table scans can use direct reads (see Metalink note 793845.1). This is basically done for improving the performance of large full table scans. In my opinion, the most important pros and cons of this feature are the following…

Pros (which are related to the fact that the buffer cache is not used):

  • Reduced latch utilization
  • The size of the physical I/Os does not depend on the blocks that are in the buffer cache

Cons:

  • A segment checkpoint must precede the first direct read
  • Repeated delayed block cleanouts

In this post, as the title suggests, I would like to focus on the last one. The reason is simple. I cannot remember having already read something about that topic.

Let’s start by showing you an example where delayed block cleanouts are performed…

  • Create a test table and insert some data into it (note that the table uses 10000 database blocks):

SQL> CREATE TABLE t (id NUMBER, pad VARCHAR2(500)) TABLESPACE test_8k;

SQL> INSERT INTO t
  2  SELECT rownum, rpad('*',500,'*')
  3  FROM dual
  4  CONNECT BY level <= 140000;

SQL> COMMIT;

  • Generate some blocks that require delayed block cleanout:

SQL> UPDATE t SET id=id*10;

SQL> COMMIT;

  • Flush the buffer cache (this is necessary because many blocks are in the buffer cache and, therefore, direct reads would not be used for reading the test table):

SQL> ALTER SYSTEM FLUSH BUFFER_CACHE;

  • Execute two queries on the test table and store the session statistics in a helper table:

SQL> CREATE TABLE s AS
  2  SELECT 0 AS run, name, value
  3  FROM v$statname NATURAL JOIN v$mystat;

SQL> SELECT count(*) FROM t;

  COUNT(*)
----------
    140000

SQL> INSERT INTO s
  2  SELECT 1, name, value
  3  FROM v$statname NATURAL JOIN v$mystat;

SQL> SELECT count(*) FROM t;

  COUNT(*)
----------
    140000

SQL> INSERT INTO s
  2  SELECT 2, name, value
  3  FROM v$statname NATURAL JOIN v$mystat;

  • Show some statistics for the two executions:

SQL> SELECT *
  2  FROM (
  3    SELECT run, name, value-lag(value) OVER (ORDER BY name, run) AS value
  4    FROM s
  5    WHERE name IN ('cleanouts only - consistent read gets',
  6                   'session logical reads',
  7                   'physical reads',
  8                   'physical reads direct')
  9  )
 10  PIVOT (sum(value) FOR run IN (1 AS RUN_1, 2 AS RUN2_2))
 11  ORDER BY name;

NAME                                          RUN_1     RUN2_2
---------------------------------------- ---------- ----------
cleanouts only - consistent read gets          9466       9399
physical reads                                10112      10004
physical reads direct                         10003      10003
session logical reads                         19850      19538

As you can see from the execution statistics, both runs did almost the same amount of (direct) reads, logical reads and cleanouts. Especially the last is strange. In fact, usually, the cleanouts are performed only once for each block. In this case, however, the same cleanouts are performed for every execution of the query.

This is clearly due to the direct reads.

To confirm that, let’s rerun the test without using serial direct reads. For that purpose, event 10949 is enabled.

SQL> TRUNCATE TABLE s;

SQL> ALTER SESSION SET EVENTS '10949 TRACE NAME CONTEXT FOREVER';

SQL> INSERT INTO s
  2  SELECT 0 AS run, name, value
  3  FROM v$statname NATURAL JOIN v$mystat;

SQL> SELECT count(*) FROM t;

  COUNT(*)
----------
    140000

SQL> INSERT INTO s
  2  SELECT 1, name, value
  3  FROM v$statname NATURAL JOIN v$mystat;

SQL> SELECT count(*) FROM t;

  COUNT(*)
----------
    140000

SQL> INSERT INTO s
  2  SELECT 2, name, value
  3  FROM v$statname NATURAL JOIN v$mystat;

SQL> SELECT *
  2  FROM (
  3    SELECT run, name, value-lag(value) OVER (ORDER BY name, run) AS value
  4    FROM s
  5    WHERE name IN ('cleanouts only - consistent read gets',
  6                   'session logical reads',
  7                   'physical reads',
  8                   'physical reads direct')
  9  )
 10  PIVOT (sum(value) FOR run IN (1 AS RUN_3, 2 AS RUN2_4))
 11  ORDER BY name;

NAME                                          RUN_3     RUN2_4
---------------------------------------- ---------- ----------
cleanouts only - consistent read gets          9399          0
physical reads                                 9919          0
physical reads direct                             0          0
session logical reads                         19592      10228

As you can see from the statistics, the major difference between the second and the third run is the number of direct reads. The interesting thing is to notice how the third run finally did a “regular” cleanout and, therefore, the subsequent run, the fourth one, was able to directly read the (cleaned) blocks. Because of that the number of logical reads of the fourth run was much lower.

24 Responses to “Impact of Direct Reads on Delayed Block Cleanouts”

  1. Comment: Chen Shapira

    Thanks for the great catch and the example code. I suspected something like this is happening, but was very confused since “everyone knows each block is only cleaned once”.

  2. Comment: Greg Rahn

    It’s interesting to note the difference in the metrics, but how would you describe the impact on a database system because of this new behavior? I’m trying to draw the line between difference in metrics and the difference in performance on a database system. What does it mean in terms of performance impact?

  3. Comment: Dion Cho

    Chris. This is a very natural but hard-to-find restriction.

    Here we have another reason for selecting the whole table in non-direct scan after big DML!

  4. Comment: Alberto Dell'Era

    Let me check my understanding: direct read means that the block is stored in RAM inside the process’ private memory (not in the buffer cache), and hence any modification to the block are made there and not persisted to disk. Correct ?

    I would guess that writing back to disk would be a bit complex; it would necessitate a synchronization with DBWR, checking that you are not writing back an older version of the block, etcetera …

  5. Comment: Christian Antognini

    Hi Greg, Dion

    First of all, it is true that such a behavior will probably cause performance problems only in specific cases. But, nevertheless, it’s in my opinion always good to know that such a possibility exist. Because of this problem, I would for example execute a silly query on a table that I know I just modified through a massive DML operation. Do you know, just in case ;-) This is especially true for data that will be “archived” or put in a read-only tablespace.

    Now, about the actual performance impact… Of course, it depends! In this specific case, let me show you a couple of figures (for producing them, I used the same queries of the post but with a table storing about 5GB of data).

    If the query is I/O bound (no data cached in the Linux buffer cache), I have measured an elapsed time of 51 seconds (8.3 seconds on CPU) with cleanouts and 46 seconds (6.1 seconds on CPU) without them. So, the execution with cleanouts is about 10% slower and uses about 30% more CPU.

    If the query is CPU bound (data is cached in the Linux buffer cache), I have measured an elapsed time of 6.1 seconds (100% of the time on CPU) with cleanouts and 4.0 seconds (100% of the time on CPU) without them. So, the execution with cleanouts is about 50% slower and uses about 50% more CPU.

    Cheers,
    Chris

  6. Comment: Christian Antognini

    Ciao Alberto

    > any modification to the block are made there and not persisted to disk. Correct ?

    Yes. And, therefore, they have to be repeated every time the block is read.

    Sa vedum,
    Chris

  7. Comment: Hemant K Chitale

    A delayed block cleanout does require updating the block (and, at some time, writing back to disk). Both operations would be through the SGA (the actual write, I expect, still has to be done by DBWR). A direct read bypassing the SGA cannot be expected (or allowed !) to update the block.
    Another non direct read session must “take the burden” of doing the block cleanout.

  8. Comment: Dion Cho

    Chris.

    In your testcase, how do you define the exact meaning of “cleanouts only – consistent read gets”?

    It seems to me that the cleanout by serial direct path read is not real cleanout, but just to read the undo info to confirm whether the block is dirty or not.

    My test case shows that two cases(direct read vs. conventional read after big update) show a big difference in redo size.

    [sourcecode language='sql']
    – serial direct read
    redo size 4,156
    – convential read
    redo size 149,184
    [/sourcecode]

    If serial direct path read really cleans out the dirty blocks, wouldn’t it show similar redo size?

  9. Comment: Christian Antognini

    Hemant

    To avoid misunderstandings, let me stress that I do not expect that a direct read updates the block on disk. Of course it cannot. With this post I just wanted to point out that behavior.

    Cheers,
    Chris

  10. Comment: Christian Antognini

    Dion

    > how do you define the exact meaning of “cleanouts only – consistent read gets”?

    I’m not sure to understand your question. Did you ask it because the statistic contains the word “only”? If yes, please notice that there is another statistics named “cleanouts and rollbacks – consistent read gets”. Their definition, according to the Referece Guide, is the following:

    cleanouts only – consistent read gets: Number of consistent gets that require only block cleanouts, no rollbacks.
    cleanouts and rollbacks – consistent read gets: Number of consistent gets that require both block rollbacks and block cleanouts.

    > My test case shows that two cases(direct read vs. conventional read after big update) show a big difference in redo size.

    I notice the same behavior. This is probably due to the fact that the block is not really modified. Honestly, I didn’t investigate further in this direction. But it could be interesting to extract the information stored in redo logs…

    Cheers,
    Chris

  11. Comment: Dion Cho

    The reference guide! What a shame I didn’t check it. :(

    The redo generated by serial direct path read seems to be the minimum workload necessary to activate the direct path read – maybe related to object checkpoint. As you commented, redo log will show exact info.

    I think that the serial direct path read does not cleanout dirty blocks multiple times(As per your expression, the blocks wouldn’t be really modified multiple times). It would just read additional blocks to check the status of the blocks.

  12. Comment: Christian Antognini

    Dion

    > maybe related to object checkpoint

    Yep. That should be the reason.

    > I think that the serial direct path read does not cleanout dirty blocks multiple times

    The block on disk must be cleaned out. So, there is much more than only checking the transaction table in the undo segment. About that point I did several crosschecks between execution statistics, block dumps, event 10203… everything is consistent and leads to the conclusion that a cleanout of every block is be performed. The only exception is for the blocks that are fast cleaned out at commit time.

    Cheers,
    Chris

  13. Pingback: Blogroll Report 17/07/2009 – 24/07/2009 « Coskan’s Approach to Oracle

    [...] Antognini-Impact of Direct Reads on Delayed Block Cleanouts Dion Cho-Disabling direct path read for the serial full table scan – [...]

  14. Pingback: Serial Direct Path Read and Block Cleanout(11g) « Dion Cho – Oracle Performance Storyteller

    [...] Antognini posted an article about the serial direct path read and repeated block cleanout problem here. I’d just like to make it clear what really happens inside [...]

  15. Pingback: Log Buffer #156: a Carnival of the Vanities for DBAs | Pythian Group Blog

    [...] Striving for Optimal Performance, Christian Antognini examines the impact of direct reads on delayed block cleanouts, one of the “most important cons,” in his opinion, of serial full table scans can using [...]

  16. Comment: Todor Botev

    I suppose if you select ora_rowscn’s in your test runs 1,2,3 they will be different in each run.
    In test run 4 they should be the same like in run 3.
    Am I right?

  17. Comment: Christian Antognini

    Hi Todor

    For all runs the ora_rowscn is the same. Makes sense…

    Cheers,
    Chris

  18. Comment: Donatello Settembrino

    Hi Christian, I read your interesting test and this connection, I bring
    to your attention some doubts on manage of direct reads.

    I did two tests, the first on DB version 10g, the other in 11g to try to
    understand if the principle is the same for the two versions DB. I know
    that a condition for Oracle decides to perform direct reads is that the
    number of blocks should be about 5 times the value of hidden parameter
    _small_table_threshold.

    test 10g

    
    SQL> select * from v$version;
    
    BANNER
    ----------------------------------------------------------------
    Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bi
    PL/SQL Release 10.2.0.4.0 - Production
    CORE    10.2.0.4.0      Production
    TNS for IBM/AIX RISC System/6000: Version 10.2.0.4.0 - Productio
    NLSRTL Version 10.2.0.4.0 - Production
    
    NAME                     VALUE      DEFLT      TYPE         DESCRIPTION                      
    ------------------------ ---------- ------------------------------------------------------------
    _serial_direct_read      FALSE      TRUE       boolean      enable direct read in serial
    
    SQL> select ksppstvl small_tab_threshold, ksppstvl * 5 as block_for_direct_reads
      2  from sys.x_$ksppi x, sys.x_$ksppsv y
      3  where (x.indx = y.indx)
      4  and ksppinm='_small_table_threshold';
    
    SMALL_TAB_THRESHOLD   BLOCK_FOR_DIRECT_READS
    --------------------  ----------------------
    2130                                  10650
    
    

    so with a number of blocks greater than 10650, the direct reads should be used

    SQL> create table t (x char(1000), y char(1000), z char(1000));
    
    SQL> insert into t select 'x', 'y', 'z' from dual connect by level <= 200000;
    
    SQL> commit;
    
    SQL> alter system flush buffer_cache;
    
    NAME                                          VALUE
    ---------------------------------------- ----------
    physical reads                                  178
    physical reads direct                             0
    
    SQL> select count(*) from t;
    
      COUNT(*)
    ----------
        200000
    
    NAME                             VALUE
    --------------------------- ----------
    physical reads                   50626
    physical reads direct                0
    
    

    but this doesn’t happen. For completeness, I also see the number of blocks of the table

    SQL> select num_rows, blocks from user_tables where table_name = 'T';
    
      NUM_ROWS     BLOCKS
    ---------- ----------
        200000      50437
    

    I run the same test at 11g:

    
    BANNER
    --------------------------------------------------------------------------------
    Oracle Database 11g Enterprise Edition Release 11.1.0.6.0 - 64bit Production
    PL/SQL Release 11.1.0.6.0 - Production
    CORE    11.1.0.6.0      Production
    TNS for IBM/AIX RISC System/6000: Version 11.1.0.6.0 - Production
    NLSRTL Version 11.1.0.6.0 - Production
    
    NAME                     VALUE      DEFLT      TYPE         DESCRIPTION                      
    ------------------------ ---------- ------------------------------------------------------------
    _serial_direct_read      FALSE      TRUE       boolean      enable direct read in serial      
    
    SMALL_TAB_THRESHOLD   BLOCK_FOR_DIRECT_READS
    --------------------  ----------------------
    1424                                    7120
    
    

    So, I have the same situation that I had in 10g, but with a different value _small_table_threshold

    
    SQL> create table t (x char(1000), y char(1000), z char(1000));
    
    SQL> insert into t select 'x', 'y', 'z' from dual connect by level <= 200000;
    
    SQL> commit;
    
    SQL> alter system flush buffer_cache;
    
    NAME                          VALUE
    ------------------------ ----------
    physical reads                  156
    physical reads direct             0
    
    SQL> select count(*) from t;
    
      COUNT(*)
    ----------
        200000
    
    NAME                                VALUE
    ------------------------------ ----------
    physical reads                      50754
    physical reads direct               50183
    
    

    I noticed that in 11g, on equal terms with the test conducted in the version
    10g, then with _serial_direct_read -> FALSE and the number of blocks (50437) exceeds 5
    times the value of _small_table_threshold (7120) the direct reads are applied
    as not done in 10g

    
    SQL> select num_rows, blocks from dba_tables where TABLE_NAME = 'T';
    
      NUM_ROWS     BLOCKS
    ---------- ----------
        200000      50437
    
    

    But then this means that even if Oracle version 11g
    _serial_direct_read = FALSE (but the number of blocks is greater than 5 times the value
    _small_table_threshold most of the other conditions to which I referred you)
    the direct reads are performed?

    Is my interpretation correct?

    Thanks a lot

    Donatello

  19. Comment: Christian Antognini

    Ciao Donatello

    Yes, your interpretation is correct. Just note that the initialization _serial_direct_read exists from a long time (8i?) and, probably for that reason, it is not (directly) related to the serial direct reads that are used in recent releases.

    Cheers,
    Chris

  20. Comment: Donatello Settembrino

    Ciao Christian,
    I would like to ask you one last question , this time as
    regards direct reads and delayed block cleanout, in 10g:

    
    dsettembrino>alter session set "_serial_direct_read" = true;
    
    dsettembrino>create table t (x char(1000), y char(1000), z char(1000));
    
    dsettembrino>insert into t select 'x', 'y', 'z' from dual connect by level <= 200000;
    
    dsettembrino>commit;
    
    

    I check if there are dirty blocks, as lawful expect:

    
    dsettembrino>select v.dirty, count(*)
      2  from v$bh v
      3  where v.objd = (select object_id from user_objects where object_name = 'T')
      4  group by v.dirty;
    
    D   COUNT(*)
    - ----------
    Y      21578
    N       1582
    
    

    This means that there are dirty blocks in buffer cache, insert’s blocks are
    by definition. I run some statistical checks before executing my query:

    dsettembrino>select n.name, m.value
      2  from  v$mystat m, v$statname n
      3  where m.STATISTIC# = n.STATISTIC#
      4  and   n.name in ('redo size', 'cleanouts only - consistent read gets' 
      5                    , 'direct path read', 'consistent gets'
      6                    , 'physical reads', 'physical reads'
      7                    ,'physical reads direct', 'undo change vector size');
    
    NAME                                                                  VALUE
    ---------------------------------------------------------------- ----------
    consistent gets                                                       92975
    physical reads                                                           51
    physical reads direct                                                     0
    redo size                                                         634580860
    undo change vector size                                             5829132
    cleanouts only - consistent read gets                                     0
    
    

    I run my query:

    dsettembrino>select count(*) from t;
    
      COUNT(*)
    ----------
        200000
    
    

    So,I rerun the statistics:

    dsettembrino>select n.name, m.value
      2  from  v$mystat m, v$statname n
      3  where m.STATISTIC# = n.STATISTIC#
      4  and   n.name in ('redo size', 'cleanouts only - consistent read gets' 
      5                   , 'direct path read', 'consistent gets'
      6                   , 'physical reads', 'physical reads'
      7                   ,'physical reads direct', 'undo change vector size');
    
    NAME                                                                  VALUE
    ---------------------------------------------------------------- ----------
    consistent gets                                                      183930
    physical reads                                                        50553
    physical reads direct                                                 50118
    redo size                                                         634586868
    undo change vector size                                             5829132
    cleanouts only - consistent read gets                                 40506
    
    

    My interpretation is the following:

    When I run my query (the select count (*) from t;) there are
    blocks in buffer cache. I see from the statistic that Oracle
    has solved my query using direct reads. To do that, Oracle
    must make changes to blocks, to download it from SGA and then
    bring these blocks into PGA, since the engine Oracle has
    established that the circumstances require the use of direct reads.

    1)Is it correct what I say?

    2)If my hypothesis is correct, it means that during the query
    (select count (*) from t) were performed delayed block cleanout
    for a number of blocks (the ones on disk) + fast commit to a number
    blocks (those who were SGA) before the engine execute direct reads?

    Kind Regards,
    Donatello

  21. Comment: Christian Antognini

    Ciao Donatello

    I’m not sure to fully understand what do you mean… This is probably because you are confused by the dirty blocks in the SGA. The important thing you have to remember is that before doing direct reads a segment checkpoint has to be performed. Because of the segment checkpoint, all blocks are on disk and, therefore, every time that they are read through a direct read a deferred cleanout has to be performed.

    HTH
    Chris

  22. Comment: Donatello Settembrino

    Ciao Christian,

    I have explained myself badly, I understood from what you had written that
    a key prerequisite is that a checkpoint has been executed,
    but what I meant is that after I insert (ie before the
    select count (*)…) saw a number bi dirty blocks in SGA

    
    D   COUNT(*)
    - ----------
    Y      21578
    N       1582
    
    

    and this means that (the image of those blocks does not match that on disk
    also because some blocks there have never been on disk, right?).
    I’d like to have only confirmed the following concept:

    when I run “select count (*) from t” and I see that Oracle uses
    “Direct reads” …

    
    NAME                        VALUE
    --------------------------- ----------
    physical reads                      51
    physical reads direct                0
    
    select count(*) from t;
    
    NAME                        VALUE
    --------------------------- ----------
    physical reads                   50553
    physical reads direct            50118
    
    

    … this means that Oracle has done in the meantime
    a checkpoint and then brought up the disc, even blocks that
    were in SGA? If not, I would not expect “direct reads”
    but “conventional reads.

    Thanks

  23. Comment: Christian Antognini

    Ciao Donatello

    > what I meant is that after I insert (ie before the
    > select count (*)…) saw a number dirty blocks in SGA

    > the image of those blocks does not match that on disk
    > also because some blocks there have never been on disk, right?

    Yes, a regular insert writes into the buffer cache only.

    > when I run “select count (*) from t” and I see that Oracle uses
    > “Direct reads” …
    >
    > … this means that Oracle has done in the meantime
    > a checkpoint and then brought up the disc, even blocks that
    > were in SGA?

    Yes.

    And now I understand what bothers you :-) It is effectively so that the DBWR has to write some blocks to disk and just after that the server process has to read them back.

    HTH
    Chris

  24. Pingback: True or False – Direct Path Reads « Charles Hooper's Oracle Notes

Leave a Reply

All comments are filtered by Akismet and then moderated by me. So, it should be no surprise that spam and anything wholly inappropriate has no chance of making it onto the site. To send questions or comments not related to this blog post please use the contact form.

Your name (required)

Your email (will not be published) (required)

Your website

Your message (please wrap code examples in <pre> tags)