Feb 17 2011

IS NULL Conditions and B-tree Indexes

Tag: 10gR1, 10gR2, 11gR1, 11gR2, 9iR2, Indexes, Query Optimizer, TOPChristian Antognini @ 11:01 am

At page 383 of my book I wrote the following sentence (BTW, the same information is also provided by Table 9-3 at page 381):

With B-tree indexes, IS NULL conditions can be applied only through composite B-tree indexes when several SQL conditions are applied and at least one of them is not based on IS NULL or an inequality.

The text continues by showing the following examples (notice that in both cases the IS NULL predicate is applied through an access predicate):

SELECT /*+ index(t) */ * FROM t WHERE n1 = 6 AND n2 IS NULL

Plan hash value: 780655320

----------------------------------------------
| Id  | Operation                   | Name   |
----------------------------------------------
|   0 | SELECT STATEMENT            |        |
|   1 |  TABLE ACCESS BY INDEX ROWID| T      |
|*  2 |   INDEX RANGE SCAN          | I_N123 |
----------------------------------------------

   2 - access("N1"=6 AND "N2" IS NULL)

SELECT /*+ index(t) */ * FROM t WHERE n1 IS NULL AND n2 = 8

Plan hash value: 780655320

----------------------------------------------
| Id  | Operation                   | Name   |
----------------------------------------------
|   0 | SELECT STATEMENT            |        |
|   1 |  TABLE ACCESS BY INDEX ROWID| T      |
|*  2 |   INDEX RANGE SCAN          | I_N123 |
----------------------------------------------

   2 - access("N1" IS NULL AND "N2"=8)
       filter("N2"=8)

When I wrote that sentence I didn’t think about one case that, according to it, specifically the part “is not based on IS NULL or an inequality”, is not covered. In fact, as the following examples show, it is also possible to apply an IS NULL predicate when the other one is an IS NOT NULL. It is especially interesting to notice that the access predicate doesn’t reference at all the NOT NULL column!

SELECT /*+ index(t) */ * FROM t WHERE n1 IS NULL AND n2 IS NOT NULL

Plan hash value: 780655320

----------------------------------------------
| Id  | Operation                   | Name   |
----------------------------------------------
|   0 | SELECT STATEMENT            |        |
|   1 |  TABLE ACCESS BY INDEX ROWID| T      |
|*  2 |   INDEX RANGE SCAN          | I_N123 |
----------------------------------------------

   2 - access("N1" IS NULL)
       filter("N2" IS NOT NULL)

SELECT /*+ index(t) */ * FROM t WHERE n1 IS NOT NULL AND n2 IS NULL

Plan hash value: 3029444779

----------------------------------------------
| Id  | Operation                   | Name   |
----------------------------------------------
|   0 | SELECT STATEMENT            |        |
|   1 |  TABLE ACCESS BY INDEX ROWID| T      |
|*  2 |   INDEX SKIP SCAN           | I_N123 |
----------------------------------------------

   2 - access("N2" IS NULL)
       filter(("N2" IS NULL AND "N1" IS NOT NULL))


Sep 12 2010

Parallel Full Table Scans Do Not Always Perform Direct Reads

Tag: 10gR1, 10gR2, 11gR1, 11gR2, Indexes, Parallel ProcessingChristian Antognini @ 11:52 am

Even though in general parallel full table scans performs direct reads, some exceptions exist. The aim of this post is to show such an exception.

For test purposes I build in my own schema a copy of the SH.SALES table (the one distributed by Oracle with the demo schemas…). On that table I build an index on the TIME_ID column and, at the same time, I trace the execution. The statements I use are the following.

execute dbms_monitor.session_trace_enable
CREATE INDEX sales_time_id ON sales (time_id) PARALLEL 2 ONLINE;
execute dbms_monitor.session_trace_disable

Since the index is built in parallel (DOP=2), 5 processes are used to run the statement: the query coordinator, 2 slaves for reading the table, and 2 slaves for building the index. Based on the data provided by extended SQL trace let’s have a look to some information about the execution.

  • Execution plan (without runtime statistics and query optimizer estimations): both the build of the index and the full table scan are performed in parallel.

Operation
----------------------------------------
PX COORDINATOR
  PX SEND QC (ORDER) :TQ10001
    INDEX BUILD NON UNIQUE SALES_TIME_ID
      SORT CREATE INDEX
        PX RECEIVE
          PX SEND RANGE :TQ10000
            PX BLOCK ITERATOR
              TABLE ACCESS FULL SALES

  • Resource usage profile of the query coordinator: no real work is performed, it’s just coordination work…

                                               Total            Number of     Duration per
Component                               Duration [s]       %       Events       Events [s]
----------------------------------- ---------------- ------- ------------ ----------------
PX Deq: Execute Reply                          1.928  79.044           44            0.044
recursive statements                           0.427  17.487          n/a              n/a
CPU                                            0.024   0.984          n/a              n/a
os thread startup                              0.017   0.690            1            0.017
log file sync                                  0.014   0.569            2            0.007
PX Deq: Parse Reply                            0.014   0.568            4            0.003
enq: CR - block range reuse ckpt               0.005   0.202            2            0.002
PX Deq: Join ACK                               0.004   0.145            5            0.001
SQL*Net message from client                    0.002   0.098            1            0.002
PX qref latch                                  0.002   0.075            1            0.002
PX Deq: Table Q qref                           0.001   0.058            1            0.001
enq: RO - fast object reuse                    0.001   0.028            1            0.001
PX Deq: Signal ACK EXT                         0.001   0.025            6            0.000
PX Deq: Slave Session Stats                    0.000   0.014            3            0.000
reliable message                               0.000   0.006            2            0.000
latch: call allocation                         0.000   0.004            1            0.000
db file sequential read                        0.000   0.004            6            0.000
rdbms ipc reply                                0.000   0.002            2            0.000
SQL*Net message to client                      0.000   0.000            1            0.000
----------------------------------- ---------------- -------
Total                                          2.439 100.000

  • Resource usage profile of one of the slaves building the index: direct writes are used to store the index.

                                               Total            Number of     Duration per
Component                               Duration [s]       %       Events       Events [s]
----------------------------------- ---------------- ------- ------------ ----------------
direct path write                              0.982  49.671          376            0.003
CPU                                            0.781  39.508          n/a              n/a
PX Deq: Table Q Normal                         0.177   8.957          130            0.001
PX Deq: Execution Msg                          0.012   0.595            4            0.003
cursor: pin S wait on X                        0.011   0.535            1            0.011
recursive statements                           0.008   0.420          n/a              n/a
log file sync                                  0.005   0.247            3            0.002
Disk file operations I/O                       0.001   0.030            1            0.001
PX Deq: Slave Session Stats                    0.000   0.021            1            0.000
reliable message                               0.000   0.011            1            0.000
rdbms ipc reply                                0.000   0.006            2            0.000
----------------------------------- ---------------- -------
Total                                          1.977 100.000

  • Resource usage profile of one of the slaves scanning the table: instead of performing direct reads, “regular” buffered reads are performed (notice the db file scattered read event).

                                               Total            Number of     Duration per
Component                               Duration [s]       %       Events       Events [s]
----------------------------------- ---------------- ------- ------------ ----------------
PX Deq: Execution Msg                          1.571  80.295           23            0.068
CPU                                            0.311  15.889          n/a              n/a
PX Deq Credit: need buffer                     0.055   2.794          432            0.000
db file scattered read                         0.016   0.835           55            0.000
PX Deq: Table Q Get Keys                       0.002   0.109            1            0.002
Disk file operations I/O                       0.001   0.032            1            0.001
PX Deq Credit: send blkd                       0.001   0.028            1            0.001
PX Deq: Slave Session Stats                    0.000   0.010            1            0.000
db file sequential read                        0.000   0.005            4            0.000
PX qref latch                                  0.000   0.002            1            0.000
latch: cache buffers chains                    0.000   0.000            1            0.000
----------------------------------- ---------------- -------
Total                                          1.957 100.000

As pointed out by the last resource usage profile no direct reads are performed. Why? In this case it is because the ONLINE option was specified. By the way, I do not know why there is such a limitation… Anway, without this option, for one of the slaves reading the table the following resource usage profile is used (notice the direct path read event). I do not show the other resource usage profiles and the execution plan because they do not change.

                                               Total            Number of     Duration per
Component                               Duration [s]       %       Events       Events [s]
----------------------------------- ---------------- ------- ------------ ----------------
PX Deq: Execution Msg                          1.499  80.358           22            0.068
CPU                                            0.278  14.897          n/a              n/a
PX Deq Credit: need buffer                     0.071   3.790          504            0.000
direct path read                               0.012   0.630           52            0.000
PX Deq Credit: send blkd                       0.003   0.176            2            0.002
PX Deq: Table Q Get Keys                       0.001   0.047            2            0.000
PX Deq: Slave Session Stats                    0.001   0.041            1            0.001
Disk file operations I/O                       0.001   0.033            1            0.001
library cache: mutex X                         0.000   0.025            1            0.000
db file sequential read                        0.000   0.002            2            0.000
asynch descriptor resize                       0.000   0.001           18            0.000
----------------------------------- ---------------- -------
Total                                          1.866 100.000

In summary, do not expect to always see direct reads when a parallel full table scan is performed.


Aug 02 2010

Partition-Wise Join of List-Partitioned Tables

Tag: 10gR1, 10gR2, 11gR1, 11gR2, Partitioning, Query OptimizerChristian Antognini @ 10:59 am

When two tables are equi-partitioned on their join keys, the query optimizer is able to take advantage of partition-wise joins. To make sure that the tables are equi-partitioned, as of Oracle Database 11g reference partitioning can be used. In fact, per definition, with reference partitioning all “related” tables have exactly the same partitioning schema. If you are not using reference partitioning, you must be very careful that the tables are effectively partitioned in very same way. For range and hash partitioned tables this is usually not a problem. However, when using list partitioning, it is quite easy to make a mistake. The reason is that the partitions can be defined in any order. Let’s have a look to an example based on the following two tables.

SQL> CREATE TABLE t1p
  2  PARTITION BY LIST (pkey) (
  3    PARTITION p_0 VALUES (0),
  4    PARTITION p_1 VALUES (1),
  5    PARTITION p_2 VALUES (2),
  6    PARTITION p_3 VALUES (3),
  7    PARTITION p_4 VALUES (4),
  8    PARTITION p_5 VALUES (5),
  9    PARTITION p_6 VALUES (6),
 10    PARTITION p_7 VALUES (7),
 11    PARTITION p_8 VALUES (8),
 12    PARTITION p_9 VALUES (9)
 13  )
 14  AS
 15  SELECT rownum AS num, mod(rownum,10) AS pkey, dbms_random.string('p',50) AS pad
 16  FROM dual
 17  CONNECT BY level <= 10000;

SQL> CREATE TABLE t2p
  2  PARTITION BY LIST (pkey) (
  3    PARTITION p_0 VALUES (0),
  4    PARTITION p_1 VALUES (1),
  5    PARTITION p_2 VALUES (2),
  6    PARTITION p_3 VALUES (3),
  7    PARTITION p_5 VALUES (5),
  8    PARTITION p_4 VALUES (4),
  9    PARTITION p_6 VALUES (6),
 10    PARTITION p_7 VALUES (7),
 11    PARTITION p_8 VALUES (8),
 12    PARTITION p_9 VALUES (9)
 13  )
 14  AS
 15  SELECT rownum AS num, mod(rownum,10) AS pkey, dbms_random.string('p',50) AS pad
 16  FROM dual
 17  CONNECT BY level <= 10000;

SQL> BEGIN
  2    dbms_stats.gather_table_stats(user,'t1p');
  3    dbms_stats.gather_table_stats(user,'t2p');
  4  END;
  5  /

Even though they are logically equivalent, as shown in the following execution plan, with them partition-wise joins cannot be used.

SQL> EXPLAIN PLAN FOR SELECT * FROM t1p JOIN t2p USING (num, pkey);

SQL> SELECT * FROM table(dbms_xplan.display(format=>'basic'));

PLAN_TABLE_OUTPUT
------------------------------------

Plan hash value: 3059592055

------------------------------------
| Id  | Operation           | Name |
------------------------------------
|   0 | SELECT STATEMENT    |      |
|   1 |  HASH JOIN          |      |
|   2 |   PARTITION LIST ALL|      |
|   3 |    TABLE ACCESS FULL| T1P  |
|   4 |   PARTITION LIST ALL|      |
|   5 |    TABLE ACCESS FULL| T2P  |
------------------------------------

The difference in the order of the partitions can also be confirmed by a query like the following one.

SQL> SELECT t1p.high_value,
  2         t1p.partition_position AS pos_t1p,
  3         t2p.partition_position AS pos_t2p,
  4         decode(t1p.partition_position, t2p.partition_position, 'Y', 'N') AS equal
  5  FROM user_tab_partitions t1p JOIN user_tab_partitions t2p ON t1p.partition_name = t2p.partition_name
  6  WHERE t1p.table_name = 'T1P'
  7  AND t2p.table_name = 'T2P';

HIGH_VALUE   POS_T1P  POS_T2P EQUAL
----------- -------- -------- ------
0                  1        1 Y
1                  2        2 Y
2                  3        3 Y
3                  4        4 Y
5                  6        5 N
4                  5        6 N
6                  7        7 Y
7                  8        8 Y
8                  9        9 Y
9                 10       10 Y

It goes without saying that to solve the problem it is necessary to reorder the partitions. To do so it is enough to move the out-of-order partitions. To avoid a double storage of the data a series of ALTER TABLE EXCHANGE/DROP/ADD/EXCHANGE statements can be used.

  • Move the P5 partition of the T1P table

SQL> CREATE TABLE t1p_5 AS
  2  SELECT *
  3  FROM t1p PARTITION (p_5)
  4  WHERE 1 = 0;

SQL> ALTER TABLE t1p EXCHANGE PARTITION p_5 WITH TABLE t1p_5;

SQL> ALTER TABLE t1p DROP PARTITION p_5;

SQL> ALTER TABLE t1p ADD PARTITION p_5 VALUES (5);

SQL> ALTER TABLE t1p EXCHANGE PARTITION p_5 WITH TABLE t1p_5;

SQL> DROP TABLE t1p_5 PURGE;

  • Move the P5 partition of the T2P table

SQL> CREATE TABLE t2p_5 AS
  2  SELECT *
  3  FROM t2p PARTITION (p_5)
  4  WHERE 1 = 0;

SQL> ALTER TABLE t2p EXCHANGE PARTITION p_5 WITH TABLE t2p_5;

SQL> ALTER TABLE t2p DROP PARTITION p_5;

SQL> ALTER TABLE t2p ADD PARTITION p_5 VALUES (5);

SQL> ALTER TABLE t2p EXCHANGE PARTITION p_5 WITH TABLE t2p_5;

SQL> DROP TABLE t2p_5 PURGE;

  • Check whether the order is ok

SQL> SELECT t1p.high_value,
  2         t1p.partition_position AS pos_t1p,
  3         t2p.partition_position AS pos_t2p,
  4         decode(t1p.partition_position, t2p.partition_position, 'Y', 'N') AS equal
  5  FROM user_tab_partitions t1p JOIN user_tab_partitions t2p ON t1p.partition_name = t2p.partition_name
  6  WHERE t1p.table_name = 'T1P'
  7  AND t2p.table_name = 'T2P';

HIGH_VALUE   POS_T1P  POS_T2P EQUAL
----------- -------- -------- ------
0                  1        1 Y
1                  2        2 Y
2                  3        3 Y
3                  4        4 Y
4                  5        5 Y
6                  6        6 Y
7                  7        7 Y
8                  8        8 Y
9                  9        9 Y
5                 10       10 Y

After these operations partition-wise joins are allowed. The following execution plan confirms this.

SQL> SELECT * FROM table(dbms_xplan.display(format=>'basic'));

PLAN_TABLE_OUTPUT
------------------------------------

Plan hash value: 1324269388

------------------------------------
| Id  | Operation           | Name |
------------------------------------
|   0 | SELECT STATEMENT    |      |
|   1 |  PARTITION LIST ALL |      |
|   2 |   HASH JOIN         |      |
|   3 |    TABLE ACCESS FULL| T1P  |
|   4 |    TABLE ACCESS FULL| T2P  |
------------------------------------


Jun 03 2010

Optimizer Mode Mismatch Does Not Prevent Sharing of Child Cursor!?!?

Tag: 10gR1, 10gR2, 11gR1, 11gR2, 9iR2, Bug, Query Optimizer, SQL TraceChristian Antognini @ 6:40 pm

The aim of this post is to describe a strange (buggy) situation that I observed recently. But before doing that, I shortly summarize what a parent cursor and a child cursor are as well as when they can be shared. By the way, I borrowed this description from the pages 20/21 of my book. Hence, if you are interested in more information about this topic refer to it…

The result of a parse operation is a parent cursor and a child cursor stored in the library cache.

The key information related to a parent cursor is the text of the SQL statement. Therefore, several SQL statements share the same parent cursor if their text is exactly the same (note that there is at least an exception to this, specifically when cursor sharing is used). In the following example, four SQL statements are executed. Two have the same text. Two others differ only because of lowercase and uppercase letters or blanks. Through the V$SQLAREA view, it is possible to confirm that three distinct parent cursors were created.

SQL> ALTER SYSTEM FLUSH SHARED_POOL;

SQL> SELECT * FROM t WHERE n = 1234;

SQL> select * from t where n = 1234;

SQL> SELECT * FROM t WHERE n=1234;

SQL> SELECT * FROM t WHERE n = 1234;

SQL> SELECT sql_id, sql_text, executions
  2  FROM v$sqlarea
  3  WHERE sql_text LIKE '%1234';

SQL_ID        SQL_TEXT                          EXECUTIONS
------------- --------------------------------- ----------
2254m1487jg50 select * from t where n = 1234             1
g9y3jtp6ru4cb SELECT * FROM t WHERE n = 1234             2
7n8p5s2udfdsn SELECT * FROM t WHERE n=1234               1

The key information related to a child cursor is the execution plan and the execution environment related to it. The execution environment is important because if it changes, the execution plan might change as well. As a result, several SQL statements are able to share the same child cursor only if they share the same parent cursor and their execution environments are compatible. To illustrate, the same SQL statement is executed with two different values of the initialization OPTIMIZER_MODE parameter. The result is that a single parent cursor and two child cursors are created.

SQL> ALTER SESSION SET optimizer_mode = all_rows;

SQL> SELECT count(*) FROM t;

COUNT(*)
----------
      1000

SQL> ALTER SESSION SET optimizer_mode = first_rows_10;

SQL> SELECT count(*) FROM t;

COUNT(*)
----------
      1000

SQL> SELECT sql_id, child_number, sql_text, optimizer_mode, plan_hash_value
  2  FROM v$sql
  3  WHERE sql_id = (SELECT prev_sql_id
  4  FROM v$session
  5  WHERE sid = sys_context('userenv','sid'));

SQL_ID        CHILD_NUMBER SQL_TEXT               OPTIMIZER_MODE PLAN_HASH_VALUE
------------- ------------ ---------------------- -------------- ---------------
5tjqf7sx5dzmj            0 SELECT count(*) FROM t ALL_ROWS            2966233522
5tjqf7sx5dzmj            1 SELECT count(*) FROM t FIRST_ROWS          2966233522

To know which mismatch led to several child cursors, you can query the V$SQL_SHARED_CURSOR view.

SQL> SELECT child_number, optimizer_mode_mismatch
  2  FROM v$sql_shared_cursor
  3  WHERE sql_id = '5tjqf7sx5dzmj';

CHILD_NUMBER OPTIMIZER_MODE_MISMATCH
------------ -----------------------
           0 N
           1 Y

So far, so good… Now, let’s see what’s strange…

The interesting thing to point out about the previous example is that while I set FIRST_ROWS_10 as optimizer mode, the V$SQL view displayed the value FIRST_ROWS. Mhmm… That’s strange… They are two different optimizer modes. They cannot be considered equivalent. What are the implications? It is just the view that provides the wrong information or the database engine is able to share the same child cursor even with two different values of the OPTIMIZER_MODE parameter? Let’s try it with FIRST_ROWS (i.e. without “_10”)…

 SQL> ALTER SESSION SET optimizer_mode = first_rows;

SQL> SELECT sql_id, child_number, sql_text, optimizer_mode, executions
  2  FROM v$sql
  3  WHERE sql_id = (SELECT prev_sql_id
  4                  FROM v$session
  5                  WHERE sid = sys_context('userenv','sid'));

SQL_ID        CHILD_NUMBER SQL_TEXT                          OPTIMIZER_MODE EXECUTIONS
------------- ------------ --------------------------------- -------------- ----------
5tjqf7sx5dzmj            0 SELECT count(*) FROM t            ALL_ROWS                1
5tjqf7sx5dzmj            1 SELECT count(*) FROM t            FIRST_ROWS              2

Oh, damn! Even though the OPTIMIZER MODE is set to a different value the same child cursor is used. Since in this particular situation the execution plans associated to both child cursors are the same (their hash value are equal), it’s not a real problem. But, in practice, it might be possible that two different optimizer modes lead to different execution plans. The following example illustrates this.

  • Build a table for the test:

SQL> CREATE TABLE t AS
  2  SELECT rownum AS id, rpad('*',500,'*') AS pad
  3  FROM dual
  4  CONNECT BY level <= 1000;

SQL> CREATE UNIQUE INDEX i ON t (id);

SQL> execute dbms_stats.gather_table_stats(user, 'T')

  • Show that different values of the OPTIMIZER_MODE parameter lead to different execution plans:

SQL> ALTER SESSION SET optimizer_mode = FIRST_ROWS_1;

SQL> EXPLAIN PLAN FOR SELECT * FROM t WHERE id <= 500;

SQL> SELECT * FROM table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------
Plan hash value: 242607798

------------------------------------------------------------------------------------
| Id  | Operation                   | Name | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |      |     3 |  1515 |     3   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T    |     3 |  1515 |     3   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | I    |       |       |     2   (0)| 00:00:01 |
------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("ID"<=500)

SQL> ALTER SESSION SET optimizer_mode = FIRST_ROWS_1000;

SQL> EXPLAIN PLAN FOR SELECT * FROM t WHERE id <= 500;

SQL> SELECT * FROM table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------
Plan hash value: 1601196873

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |   500 |   246K|    10   (0)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| T    |   500 |   246K|    10   (0)| 00:00:01 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("ID"<=500)

  • Execute the test query with both values of the OPTIMIZER_MODE parameter:

SQL> ALTER SYSTEM FLUSH SHARED_POOL;

SQL> ALTER SESSION SET optimizer_mode = FIRST_ROWS_1;

SQL> SELECT * FROM t WHERE id <= 500;

        ID PAD
---------- ----------
         1 **********
         2 **********
…
       499 **********
       500 **********

SQL> ALTER SESSION SET optimizer_mode = FIRST_ROWS_1000;

SQL> SELECT * FROM t WHERE id <= 500;

        ID PAD
---------- ----------
         1 **********
         2 **********
…
       499 **********
       500 **********

  • Show that a single execution plan was used for both executions:

SQL> SELECT * FROM table(dbms_xplan.display_cursor(NULL,NULL));

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------
SQL_ID  2vw03p929jzgz, child number 0
-------------------------------------
SELECT * FROM t WHERE id <= 500

Plan hash value: 242607798

------------------------------------------------------------------------------------
| Id  | Operation                   | Name | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |      |       |       |     3 (100)|          |
|   1 |  TABLE ACCESS BY INDEX ROWID| T    |     3 |  1515 |     3   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | I    |       |       |     2   (0)| 00:00:01 |
------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("ID"<=500)

SQL> SELECT sql_id, child_number, executions, optimizer_mode
  2  FROM v$sql
  3  WHERE sql_id = '2vw03p929jzgz';

SQL_ID        CHILD_NUMBER EXECUTIONS OPTIMIZER_MODE
------------- ------------ ---------- --------------
2vw03p929jzgz            0          2 FIRST_ROWS

Even though it is not very likely that this bug (yes, in my opinion something like this cannot be considered a restriction of the implementation…) has an impact on a production system, I really don’t understand why the developers didn’t implement it correctly. It should not be that difficult to manage a byte containing the information about the used optimizer mode! Note that this is not the only case where something like that happens with the first rows optimizer mode. For example, also in a trace file generated through SQL trace no difference is made between the old and the new first row optimizer. So, it seams that they really got it wrong.


May 04 2010

Native Full Outer Join Officially Available in 10.2.0.5

Tag: 10gR2, Query OptimizerChristian Antognini @ 12:25 pm

Today I installed for the first time the patchset 10.2.0.5. While reading the README file, I noticed the following piece of information.

To enable a new native full outer join implementation in the database, a user has to set the following underscore parameter:

_optimizer_native_full_outer_join =force

You can set this parameter for the system or for a specific session.

Besides dramatically improving the performance of a full outer join, the new implementation fixes a variety of issues, for examples a variety of ORA-942 (table or view doesn’t exists) and ORA-4331 (unable to allocate string bytes of shared memory) errors.

This issue is tracked with Oracle bug 6322672.

Great! At last we can officially take advantage of native full outer join also in 10.2 (the feature was officially introduced in 11.1, but was already “available” in 10.2.0.3).

Here is an example:

  • By default native full outer joins are disabled (notice the implementation with the UNION ALL operation):

SQL> SELECT * FROM emp FULL OUTER JOIN dept USING (deptno);

Execution Plan
----------------------------------------------------------
Plan hash value: 2291915024

---------------------------------------------
| Id  | Operation            | Name         |
---------------------------------------------
|   0 | SELECT STATEMENT     |              |
|   1 |  VIEW                |              |
|   2 |   UNION-ALL          |              |
|*  3 |    HASH JOIN OUTER   |              |
|   4 |     TABLE ACCESS FULL| EMP          |
|   5 |     TABLE ACCESS FULL| DEPT         |
|   6 |    NESTED LOOPS ANTI |              |
|   7 |     TABLE ACCESS FULL| DEPT         |
|*  8 |     INDEX RANGE SCAN | EMP_DEPTNO_I |
---------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - access("EMP"."DEPTNO"="DEPT"."DEPTNO"(+))
   8 - access("EMP"."DEPTNO"="DEPT"."DEPTNO")

  • As suggested by the README file, the feature can be enabled at the session level:

SQL> ALTER SESSION SET "_optimizer_native_full_outer_join" = force;

SQL> SELECT * FROM emp FULL OUTER JOIN dept USING (deptno);

Execution Plan
----------------------------------------------------------
Plan hash value: 51889263

------------------------------------------
| Id  | Operation             | Name     |
------------------------------------------
|   0 | SELECT STATEMENT      |          |
|   1 |  VIEW                 | VW_FOJ_0 |
|*  2 |   HASH JOIN FULL OUTER|          |
|   3 |    TABLE ACCESS FULL  | DEPT     |
|   4 |    TABLE ACCESS FULL  | EMP      |
------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("EMP"."DEPTNO"="DEPT"."DEPTNO")


« Previous PageNext Page »