Parallel Full Table Scans Do Not Always Perform Direct Reads

Even though in general parallel full table scans performs direct reads, some exceptions exist. The aim of this post is to show such an exception.

For test purposes I build in my own schema a copy of the SH.SALES table (the one distributed by Oracle with the demo schemas…). On that table I build an index on the TIME_ID column and, at the same time, I trace the execution. The statements I use are the following.

execute dbms_monitor.session_trace_enable
CREATE INDEX sales_time_id ON sales (time_id) PARALLEL 2 ONLINE;
execute dbms_monitor.session_trace_disable

Since the index is built in parallel (DOP=2), 5 processes are used to run the statement: the query coordinator, 2 slaves for reading the table, and 2 slaves for building the index. Based on the data provided by extended SQL trace let’s have a look to some information about the execution.

  • Execution plan (without runtime statistics and query optimizer estimations): both the build of the index and the full table scan are performed in parallel.
Operation
----------------------------------------
PX COORDINATOR
  PX SEND QC (ORDER) :TQ10001
    INDEX BUILD NON UNIQUE SALES_TIME_ID
      SORT CREATE INDEX
        PX RECEIVE
          PX SEND RANGE :TQ10000
            PX BLOCK ITERATOR
              TABLE ACCESS FULL SALES
  • Resource usage profile of the query coordinator: no real work is performed, it’s just coordination work…
                                               Total            Number of     Duration per
Component                               Duration [s]       %       Events       Events [s]
----------------------------------- ---------------- ------- ------------ ----------------
PX Deq: Execute Reply                          1.928  79.044           44            0.044
recursive statements                           0.427  17.487          n/a              n/a
CPU                                            0.024   0.984          n/a              n/a
os thread startup                              0.017   0.690            1            0.017
log file sync                                  0.014   0.569            2            0.007
PX Deq: Parse Reply                            0.014   0.568            4            0.003
enq: CR - block range reuse ckpt               0.005   0.202            2            0.002
PX Deq: Join ACK                               0.004   0.145            5            0.001
SQL*Net message from client                    0.002   0.098            1            0.002
PX qref latch                                  0.002   0.075            1            0.002
PX Deq: Table Q qref                           0.001   0.058            1            0.001
enq: RO - fast object reuse                    0.001   0.028            1            0.001
PX Deq: Signal ACK EXT                         0.001   0.025            6            0.000
PX Deq: Slave Session Stats                    0.000   0.014            3            0.000
reliable message                               0.000   0.006            2            0.000
latch: call allocation                         0.000   0.004            1            0.000
db file sequential read                        0.000   0.004            6            0.000
rdbms ipc reply                                0.000   0.002            2            0.000
SQL*Net message to client                      0.000   0.000            1            0.000
----------------------------------- ---------------- -------
Total                                          2.439 100.000
  • Resource usage profile of one of the slaves building the index: direct writes are used to store the index.
                                               Total            Number of     Duration per
Component                               Duration [s]       %       Events       Events [s]
----------------------------------- ---------------- ------- ------------ ----------------
direct path write                              0.982  49.671          376            0.003
CPU                                            0.781  39.508          n/a              n/a
PX Deq: Table Q Normal                         0.177   8.957          130            0.001
PX Deq: Execution Msg                          0.012   0.595            4            0.003
cursor: pin S wait on X                        0.011   0.535            1            0.011
recursive statements                           0.008   0.420          n/a              n/a
log file sync                                  0.005   0.247            3            0.002
Disk file operations I/O                       0.001   0.030            1            0.001
PX Deq: Slave Session Stats                    0.000   0.021            1            0.000
reliable message                               0.000   0.011            1            0.000
rdbms ipc reply                                0.000   0.006            2            0.000
----------------------------------- ---------------- -------
Total                                          1.977 100.000
  • Resource usage profile of one of the slaves scanning the table: instead of performing direct reads, “regular” buffered reads are performed (notice the db file scattered read event).
                                               Total            Number of     Duration per
Component                               Duration [s]       %       Events       Events [s]
----------------------------------- ---------------- ------- ------------ ----------------
PX Deq: Execution Msg                          1.571  80.295           23            0.068
CPU                                            0.311  15.889          n/a              n/a
PX Deq Credit: need buffer                     0.055   2.794          432            0.000
db file scattered read                         0.016   0.835           55            0.000
PX Deq: Table Q Get Keys                       0.002   0.109            1            0.002
Disk file operations I/O                       0.001   0.032            1            0.001
PX Deq Credit: send blkd                       0.001   0.028            1            0.001
PX Deq: Slave Session Stats                    0.000   0.010            1            0.000
db file sequential read                        0.000   0.005            4            0.000
PX qref latch                                  0.000   0.002            1            0.000
latch: cache buffers chains                    0.000   0.000            1            0.000
----------------------------------- ---------------- -------
Total                                          1.957 100.000

As pointed out by the last resource usage profile no direct reads are performed. Why? In this case it is because the ONLINE option was specified. By the way, I do not know why there is such a limitation… Anway, without this option, for one of the slaves reading the table the following resource usage profile is used (notice the direct path read event). I do not show the other resource usage profiles and the execution plan because they do not change.

                                               Total            Number of     Duration per
Component                               Duration [s]       %       Events       Events [s]
----------------------------------- ---------------- ------- ------------ ----------------
PX Deq: Execution Msg                          1.499  80.358           22            0.068
CPU                                            0.278  14.897          n/a              n/a
PX Deq Credit: need buffer                     0.071   3.790          504            0.000
direct path read                               0.012   0.630           52            0.000
PX Deq Credit: send blkd                       0.003   0.176            2            0.002
PX Deq: Table Q Get Keys                       0.001   0.047            2            0.000
PX Deq: Slave Session Stats                    0.001   0.041            1            0.001
Disk file operations I/O                       0.001   0.033            1            0.001
library cache: mutex X                         0.000   0.025            1            0.000
db file sequential read                        0.000   0.002            2            0.000
asynch descriptor resize                       0.000   0.001           18            0.000
----------------------------------- ---------------- -------
Total                                          1.866 100.000

In summary, do not expect to always see direct reads when a parallel full table scan is performed.

8 Comments

  1. Flado Flado
    September 13, 2010    

    > In this case it is because the ONLINE option was specified. By the way, I do not know why there is such a limitation…

    Maybe because Oracle needs a consistent view of the table at the time the “create index” started? Presumably, there will be updates to the table while the index is being built – these will be gathered and applied at the end, but the application process needs to start with the index “as of” when the gathering started.
    Just a guess.
    Greetings,
    Flado

  2. September 13, 2010    

    Hi Flado

    Even when direct reads are performed, read consistency must be guaranteed. To do so an object checkpoint is issued before performing the direct reads. As a result the necessary data is stored on disk. Therefore, I do not see why that should be different for an online index creation. IMO to read the data that has been updated since the CREATE INDEX started, another read, based on another SCN, is required.

    Thank you for sharing your opinion!

    Cheers,
    Chris

  3. Flado Flado
    September 13, 2010    

    Hi Chris! Thanks for answering :-)

    Another full scan (based on another SCN) seems too big a price to pay for the ONLINE part of an index rebuild. And what about the changes made while this second scan runs? No, I think the apply process must be very quick, so that the table may remain locked while it runs.
    I think that the fast object checkpoint trick works for SELECTs which must ignore the changes made while they run; in contrast, an index rebuild must incorporate those changes.

    Cheers,
    Flado

  4. Flado Flado
    September 13, 2010    

    You are right that the initial consistent view may be aquired in the same manner as for parallel SELECT, though. Hmmm… a puzzle…

    Cheers,
    Flado

  5. September 15, 2010    

    Hi Flado

    > Another full scan (based on another SCN) seems too big a price
    > to pay for the ONLINE part of an index rebuild.

    I wrote “another read” not “another full scan”. The only way to efficiently perform an online index creation is to know which rows are modified after the initial full scan is started. I.e. they have to use some kind of log to keep the rowids of those rows… and, during the last phase of the build, they access them directly.

    Cheers,
    Chris

  6. John John
    April 11, 2014    

    Christian
    Can we consider that a LIO on a block in PGA (resulted of a Direct Path Read) is less resource consumer than other readed from Buffer Cache?

    Thanks in advance, your block is fantastic.

    John

    • April 11, 2014    

      Hi John

      Since it doesn’t have to go through all the serialization steps…. yes, it can be considered “cheaper”.

      Cheers,
      Chris

    • John John
      April 14, 2014    

      Sorry, I meant “your blog is fantastic” :-)

      John

Leave a Reply

Your email address will not be published. Required fields are marked *


Time limit is exhausted. Please reload CAPTCHA.