Feb 28 2010

Tracing VPD Predicates

Tag: 10gR1, 10gR2, 11gR1, 11gR2, 9iR2Christian Antognini @ 12:30 pm

Even though a number of articles and blog posts have already been written on this topic (e.g. on Pete Finnigan’s site I found references dating back from 2003), from time to time I’m still asked “How to trace predicates generated by VPD?”. Hence, here’s yet another blog post about this topic…

Let’s setup the scene before explaining how you can do it:

  • The user named CHA owns the schema created with the script ?/sqlplus/demo/demobld.sql.
  • The data stored in the EMP table is the following:

SQL> SELECT * FROM emp;

     EMPNO ENAME      JOB              MGR HIREDATE         SAL       COMM     DEPTNO
---------- ---------- --------- ---------- --------- ---------- ---------- ----------
      7369 SMITH      CLERK           7902 17-DEC-80        800                    20
      7499 ALLEN      SALESMAN        7698 20-FEB-81       1600        300         30
      7521 WARD       SALESMAN        7698 22-FEB-81       1250        500         30
      7566 JONES      MANAGER         7839 02-APR-81       2975                    20
      7654 MARTIN     SALESMAN        7698 28-SEP-81       1250       1400         30
      7698 BLAKE      MANAGER         7839 01-MAY-81       2850                    30
      7782 CLARK      MANAGER         7839 09-JUN-81       2450                    10
      7788 SCOTT      ANALYST         7566 09-DEC-82       3000                    20
      7839 KING       PRESIDENT            17-NOV-81       5000                    10
      7844 TURNER     SALESMAN        7698 08-SEP-81       1500          0         30
      7876 ADAMS      CLERK           7788 12-JAN-83       1100                    20
      7900 JAMES      CLERK           7698 03-DEC-81        950                    30
      7902 FORD       ANALYST         7566 03-DEC-81       3000                    20
      7934 MILLER     CLERK           7782 23-JAN-82       1300                    10

  • The data stored in the EMP table is protected by a VPD predicate created with the following commands:

SQL> CREATE OR REPLACE FUNCTION emp_restrict (p_schema IN VARCHAR2, p_table IN VARCHAR2) RETURN VARCHAR2 AS
  2  BEGIN
  3    RETURN '''' || sys_context('userenv','session_user') || ''' = ename';
  4  END emp_restrict;
  5  /

SQL> execute dbms_rls.add_policy('CHA','EMP','EMP_POLICY','CHA','EMP_RESTRICT');

  • Because of the VPD predicate, different users see different rows. Here an example:

SQL> connect scott

SQL> SELECT * FROM cha.emp;

     EMPNO ENAME      JOB              MGR HIREDATE         SAL       COMM     DEPTNO
---------- ---------- --------- ---------- --------- ---------- ---------- ----------
      7788 SCOTT      ANALYST         7566 09-DEC-82       3000                    20

SQL> connect clark

SQL> SELECT * FROM cha.emp;

     EMPNO ENAME      JOB              MGR HIREDATE         SAL       COMM     DEPTNO
---------- ---------- --------- ---------- --------- ---------- ---------- ----------
      7782 CLARK      MANAGER         7839 09-JUN-81       2450                    10

If the function used for the VPD policy is simple and does generate a predicate that can be correctly parsed, to view the predicate it is enough to give a look to the output of the dbms_xplan package. The following SQL statements illustrate this:

SQL> SELECT * FROM table(dbms_xplan.display_cursor(sql_id=>'dmc3z4t0u57y1', format=>'basic predicate'));

EXPLAINED SQL STATEMENT:
------------------------
SELECT * FROM cha.emp

Plan hash value: 3956160932

----------------------------------
| Id  | Operation         | Name |
----------------------------------
|   0 | SELECT STATEMENT  |      |
|*  1 |  TABLE ACCESS FULL| EMP  |
----------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("ENAME"='SCOTT')

Another possibility, if you have access to the V$VPD_POLICY view, is to execute a query like the following one:

SQL> SELECT predicate FROM v$vpd_policy WHERE sql_id = 'dmc3z4t0u57y1';

PREDICATE
----------------------
'SCOTT' = ename

However, in case you want to see the whole SQL statement or the predicate generated by the VPD policy leads to an ORA-28113 (policy predicate has error), there is no documented way I’m aware of to display the generated predicate. One of the undocumented ways to do it is to use the event 10730. Note that the event generates a trace file containing the information we are looking for in such situations. Here is an example:

  • Enable the event:

SQL> ALTER SESSION SET events '10730 trace name context forever, level 1';

  • Run the query that, for example, leads to an error (note that to generate the ORA-28113 I modified the EMP_RESTRICT function…):

SQL> SELECT * FROM cha.emp;
SELECT * FROM cha.emp
                  *
ERROR at line 1:
ORA-28113: policy predicate has error

  • Check the trace file generated by the database engine (note that the V$DIAG_INFO view is available as of 11g only):

SQL> SELECT value FROM v$diag_info WHERE name = 'Default Trace File';

VALUE
---------------------------------------------------------------------
/u00/app/oracle/diag/rdbms/dba112/DBA112/trace/DBA112_ora_31964.trc

SQL> host tail -10 /u00/app/oracle/diag/rdbms/dba112/DBA112/trace/DBA112_ora_31964.trc
-------------------------------------------------------------
Error information for ORA-28113:
Logon user     : SCOTT
Table/View     : CHA.EMP
Policy name    : EMP_POLICY
Policy function: CHA.EMP_RESTRICT
RLS view  :
SELECT  "EMPNO","ENAME","JOB","MGR","HIREDATE","SAL","COMM","DEPTNO" FROM "CHA"."EMP"   "EMP" WHERE ('SCOTT' = enamee)
ORA-00904: "ENAMEE": invalid identifier
-------------------------------------------------------------

As you can see, the trace file contains not only the whole SQL statement but also the reason for the ORA-28113 error.


Jan 26 2010

Does the Query Optimizer Cost PX Distribution Methods?

Tag: 10gR1, 10gR2, 11gR1, 11gR2, 9iR2, Parallel Processing, Query OptimizerChristian Antognini @ 12:55 pm

The short answer to this question is “yes”, it does. Unfortunately, the distribution costs are not externalized through the execution plans and, as a result, this limitation (yes, it is really a limitation in the current implementation, not a bug) confuses everyone that carefully look at the information provided in an execution plan of a SQL statement executed in parallel. Hence, let’s remove some confusion…

To illustrate what the problem is, let’s have a look to a simple query that joins two tables:

SELECT * FROM master m JOIN detail d ON (m.id = d.id)

Now, let’s have a look at two parallel executions. If the two tables are equipartitioned, the following execution plan (which takes advantage of partition-wise join) is probably the most effective for such a query. Note that thanks to the partition-wise join not only there is a single set of parallel slaves (Q1,00), but, in addition, the parallel slaves do not communicate with each other (they only communicate with the query coordinator). As a result, the communication costs are equal to zero (this is because the query optimizer does not compute the costs of the communication towards the query coordinator).

----------------------------------------------------------------------------------------------
| Id  | Operation               | Name     | Bytes | Cost (%CPU)|    TQ  |IN-OUT| PQ Distrib |
----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT        |          |    16G| 162524  (1)|        |      |            |
|   1 |  PX COORDINATOR         |          |       |            |        |      |            |
|   2 |   PX SEND QC (RANDOM)   | :TQ10000 |    16G| 162524  (1)|  Q1,00 | P->S | QC (RAND)  |
|   3 |    PX PARTITION HASH ALL|          |    16G| 162524  (1)|  Q1,00 | PCWC |            |
|   4 |     HASH JOIN           |          |    16G| 162524  (1)|  Q1,00 | PCWP |            |
|   5 |      TABLE ACCESS FULL  | MASTER   |   125M|   1422  (1)|  Q1,00 | PCWP |            |
|   6 |      TABLE ACCESS FULL  | DETAIL   |    15G| 161052  (1)|  Q1,00 | PCWP |            |
----------------------------------------------------------------------------------------------

If the two tables are not equipartitioned, the following execution plan might be chosen by the query optimizer. Since it does not take advantage of a partition-wise join, several set of parallel slaves are used. The first one (Q1,00) scans the MASTER table, the second one (Q1,01) scans the DETAIL table, and both of them send the data to the third one (Q1,02) that performs the join of the two tables and sends the data to the query coordinator. Since all data (about 15GB; yes, the estimations are good) is sent through the PX channels, the cost should not be zero. However, as you can see, the cost is exactly the same as the one of the previous execution plan.

----------------------------------------------------------------------------------------------
| Id  | Operation               | Name     | Bytes | Cost (%CPU)|    TQ  |IN-OUT| PQ Distrib |
----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT        |          |    16G| 162524  (1)|        |      |            |
|   1 |  PX COORDINATOR         |          |       |            |        |      |            |
|   2 |   PX SEND QC (RANDOM)   | :TQ10002 |    16G| 162524  (1)|  Q1,02 | P->S | QC (RAND)  |
|   3 |    HASH JOIN BUFFERED   |          |    16G| 162524  (1)|  Q1,02 | PCWP |            |
|   4 |     PX RECEIVE          |          |   125M|   1422  (1)|  Q1,02 | PCWP |            |
|   5 |      PX SEND HASH       | :TQ10000 |   125M|   1422  (1)|  Q1,00 | P->P | HASH       |
|   6 |       PX BLOCK ITERATOR |          |   125M|   1422  (1)|  Q1,00 | PCWC |            |
|   7 |        TABLE ACCESS FULL| MASTER   |   125M|   1422  (1)|  Q1,00 | PCWP |            |
|   8 |     PX RECEIVE          |          |    15G| 161052  (1)|  Q1,02 | PCWP |            |
|   9 |      PX SEND HASH       | :TQ10001 |    15G| 161052  (1)|  Q1,01 | P->P | HASH       |
|  10 |       PX BLOCK ITERATOR |          |    15G| 161052  (1)|  Q1,01 | PCWC |            |
|  11 |        TABLE ACCESS FULL| DETAIL   |    15G| 161052  (1)|  Q1,01 | PCWP |            |
----------------------------------------------------------------------------------------------

For completeness, let’s compare the cost of several distribution methods (“none-none” is the one of the first execution plan above, “hash-hash” of the second one). As you can see the cost is always the same!

SQL> EXPLAIN PLAN SET STATEMENT_ID 'none-none' FOR SELECT /*+ pq_distribute(d none none) */ * FROM master m JOIN detail d ON (m.id = d.id);
SQL> EXPLAIN PLAN SET STATEMENT_ID 'hash-hash' FOR SELECT /*+ pq_distribute(d hash hash) */ * FROM master m JOIN detail d ON (m.id = d.id);
SQL> EXPLAIN PLAN SET STATEMENT_ID 'broadcast-none' FOR SELECT /*+ pq_distribute(d broadcast none) */ * FROM master m JOIN detail d ON (m.id = d.id);
SQL> EXPLAIN PLAN SET STATEMENT_ID 'none-broadcast' FOR SELECT /*+ pq_distribute(d none broadcast) */ * FROM master m JOIN detail d ON (m.id = d.id);
SQL> EXPLAIN PLAN SET STATEMENT_ID 'partition-none' FOR SELECT /*+ pq_distribute(d partition none) */ * FROM master m JOIN detail d ON (m.id = d.id);
SQL> EXPLAIN PLAN SET STATEMENT_ID 'none-partition' FOR SELECT /*+ pq_distribute(d none partition) */ * FROM master m JOIN detail d ON (m.id = d.id);
SQL> SELECT statement_id, cost FROM plan_table WHERE id = 0;

STATEMENT_ID                         COST
------------------------------ ----------
none-none                          162524
hash-hash                          162524
broadcast-none                     162524
none-broadcast                     162524
partition-none                     162524
none-partition                     162524

As I wrote before, the problem is not that the costs are not computed. The problem is that they are not externalized. In fact, by giving a look to a trace file generated through the event 10053 the costs are available. Here’s the relevant part (the lines starting with “---- cost” contain the most important information). As you can see there are two costs associated with every distribution method: one with the distribution costs (w/ dist) and one without them (w/o dist).

Enumerating distribution method for join between M[MASTER] and D[DETAIL]
-- Using join method #Hash Join:
---- cost NONE = 0.00
  Outer table:  MASTER  Alias: M
    resc: 5120.11  card 4118000.00  bytes: 32  deg: 4  resp: 1422.25
  Inner table:  DETAIL  Alias: D
    resc: 579787.63  card: 31954000.00  bytes: 526  deg: 4  resp: 161052.12
    using dmeth: 513  #groups: 1
    Cost per ptn: 49.68  #ptns: 4
    hash_area: 16384 (max=16384) buildfrag: 5530  probefrag: 524636  ppasses: 1
      buildfrag: 5530  probefrag: 524636  passes: 1
  Hash join: Resc: 585106.45  Resp: 162524.05  [multiMatchCost=0.00]
---- cost(Hash Join) = 162524.05 (w/o dist), 162524.05 (w/ dist)
---- cost VALUE = 278.52
---- cost with slave mapping  =   Outer table:  MASTER  Alias: M
    resc: 5120.11  card 4118000.00  bytes: 32  deg: 4  resp: 1422.25
  Inner table:  DETAIL  Alias: D
    resc: 579787.63  card: 31954000.00  bytes: 526  deg: 4  resp: 161052.12
    using dmeth: 2  #groups: 1
    Cost per ptn: 49.68  #ptns: 4
    hash_area: 16384 (max=16384) buildfrag: 5530  probefrag: 524636  ppasses: 1
      buildfrag: 5530  probefrag: 524636  passes: 1
  Hash join: Resc: 585106.45  Resp: 162524.05  [multiMatchCost=0.00]
---- cost(Hash Join) = 162524.05 (w/o dist), 162802.57 (w/ dist)
---- cost PARTITION-RIGHT = 271.40
  Outer table:  MASTER  Alias: M
    resc: 5120.11  card 4118000.00  bytes: 32  deg: 4  resp: 1422.25
  Inner table:  DETAIL  Alias: D
    resc: 579787.63  card: 31954000.00  bytes: 526  deg: 4  resp: 161052.12
    using dmeth: 576  #groups: 1
    Cost per ptn: 49.68  #ptns: 4
    hash_area: 16384 (max=16384) buildfrag: 5530  probefrag: 524636  ppasses: 1
      buildfrag: 5530  probefrag: 524636  passes: 1
  Hash join: Resc: 585106.45  Resp: 162524.05  [multiMatchCost=0.00]
---- cost(Hash Join) = 162524.05 (w/o dist), 162795.46 (w/ dist)
---- cost PARTITION-LEFT = 7.12
  Outer table:  MASTER  Alias: M
    resc: 5120.11  card 4118000.00  bytes: 32  deg: 4  resp: 1422.25
  Inner table:  DETAIL  Alias: D
    resc: 579787.63  card: 31954000.00  bytes: 526  deg: 4  resp: 161052.12
    using dmeth: 544  #groups: 1
    Cost per ptn: 49.68  #ptns: 4
    hash_area: 16384 (max=16384) buildfrag: 5530  probefrag: 524636  ppasses: 1
      buildfrag: 5530  probefrag: 524636  passes: 1
  Hash join: Resc: 585106.45  Resp: 162524.05  [multiMatchCost=0.00]
---- cost(Hash Join) = 162524.05 (w/o dist), 162531.17 (w/ dist)
---- cost BROADCAST-RIGHT = 920.78
---- cost with slave mapping  =   Outer table:  MASTER  Alias: M
    resc: 5120.11  card 4118000.00  bytes: 32  deg: 4  resp: 1422.25
  Inner table:  DETAIL  Alias: D
    resc: 579787.63  card: 31954000.00  bytes: 526  deg: 4  resp: 161052.12
    using dmeth: 8  #groups: 4
    Cost per ptn: 49.68  #ptns: 4
    hash_area: 16384 (max=16384) buildfrag: 5530  probefrag: 524636  ppasses: 1
      buildfrag: 5530  probefrag: 524636  passes: 1
  Hash join: Resc: 585106.45  Resp: 162524.05  [multiMatchCost=0.00]
---- cost(Hash Join) = 162524.05 (w/o dist), 162755.25 (w/ dist)
---- cost BROADCAST-LEFT = 7.22
---- cost with slave mapping  =   Outer table:  MASTER  Alias: M
    resc: 5120.11  card 4118000.00  bytes: 32  deg: 4  resp: 1422.25
  Inner table:  DETAIL  Alias: D
    resc: 579787.63  card: 31954000.00  bytes: 526  deg: 4  resp: 161052.12
    using dmeth: 16  #groups: 4
    Cost per ptn: 49.68  #ptns: 4
    hash_area: 16384 (max=16384) buildfrag: 5530  probefrag: 524636  ppasses: 1
      buildfrag: 5530  probefrag: 524636  passes: 1
  Hash join: Resc: 585106.45  Resp: 162524.05  [multiMatchCost=0.00]
---- cost(Hash Join) = 162524.05 (w/o dist), 162526.86 (w/ dist)

Since the cost are (correctly) computed, the query optimizer is able to choose the optimal plan. However, it would be nice to have the actual costs in the execution plans.


Jan 11 2010

Join Elimination

Tag: 10gR2, 11gR1, 11gR2, Query OptimizerChristian Antognini @ 6:50 pm

In some specific situations the query optimizer is able to completely avoid executing a join even if a SQL statement explicitly calls for it. Two are the cases currently covered by this optimization technique, which is called join elimination. The first one was introduced in Oracle Database 10g Release 2, the second one in Oracle Database 11g Release 2. Let’s take a look at two cases to illustrate how join elimination works.

Case #1

Up to Oracle Database 11g Release 1 join elimination is especially useful when views containing joins are used. Note, however, that join elimination does not work only with views. It can be applied to SQL statements without views as well. The following SQL statements define two tables and one view. Notice that between table T1 and table T2, there is a master-child relationship. In fact, table T2, with its column T1_ID, references the primary key of table T1.

SQL> CREATE TABLE t1 (
  2    id NUMBER NOT NULL,
  3    n NUMBER,
  4    pad VARCHAR2(4000),
  5    CONSTRAINT t1_pk PRIMARY KEY(id)
  6  );

SQL> CREATE TABLE t2 (
  2    id NUMBER NOT NULL,
  3    t1_id NUMBER NOT NULL,
  4    n NUMBER,
  5    pad VARCHAR2(4000),
  6    CONSTRAINT t2_pk PRIMARY KEY(id),
  7    CONSTRAINT t2_t1_fk FOREIGN KEY (t1_id) REFERENCES t1
  8  );

SQL> CREATE VIEW v AS
  2  SELECT t1.id AS t1_id, t1.n AS t1_n, t2.id AS t2_id, t2.n AS t2_n
  3  FROM t1, t2
  4  WHERE t1.id = t2.t1_id;

When all the columns are referenced, as shown in the following example, the join is regularly executed. No surprise here.

SQL> EXPLAIN PLAN FOR SELECT * FROM v;

SQL> SELECT * FROM table(dbms_xplan.display(NULL,NULL,'basic'));

PLAN_TABLE_OUTPUT
----------------------------------------------
Plan hash value: 3114288414

----------------------------------------------
| Id  | Operation                    | Name  |
----------------------------------------------
|   0 | SELECT STATEMENT             |       |
|   1 |  NESTED LOOPS                |       |
|   2 |   NESTED LOOPS               |       |
|   3 |    TABLE ACCESS FULL         | T2    |
|   4 |    INDEX UNIQUE SCAN         | T1_PK |
|   5 |   TABLE ACCESS BY INDEX ROWID| T1    |
----------------------------------------------

However, as illustrated in the next example, when only columns defined in the child table are referenced, the query optimizer is able to eliminate the join. It can do so because there is a validated foreign key constraint that guarantees that all rows in table T2 reference one row in table T1.

SQL> EXPLAIN PLAN FOR SELECT t2_id, t2_n FROM v;

PLAN_TABLE_OUTPUT
----------------------------------
Plan hash value: 1513984157

----------------------------------
| Id  | Operation         | Name |
----------------------------------
|   0 | SELECT STATEMENT  |      |
|   1 |  TABLE ACCESS FULL| T2   |
----------------------------------

The relevant part of the output of event 10053 is the following (notice that two queries are shown; the one before the transformation and the one after the transformation):

JE:   Considering Join Elimination on query block SEL$2 (#0)
*************************
Join Elimination (JE)
*************************
SQL:******* UNPARSED QUERY IS *******
SELECT "T2"."ID" "T2_ID","T2"."N" "T2_N" FROM CHA."T1" "T1",CHA."T2" "T2" WHERE "T1"."ID"="T2"."T1_ID"
JE:   cfro: T2 objn:86871 col#:2 dfro:T1 dcol#:2
JE:   cfro: T2 objn:86871 col#:2 dfro:T1 dcol#:2
Query block (0x2b732c78) before join elimination:
SQL:******* UNPARSED QUERY IS *******
SELECT "T2"."ID" "T2_ID","T2"."N" "T2_N" FROM CHA."T1" "T1",CHA."T2" "T2" WHERE "T2"."T1_ID"="T1"."ID"
JE:   eliminate table: T1 (T1)
Registered qb: SEL$FFBD8603 0x2b732c78 (JOIN REMOVED FROM QUERY BLOCK SEL$2; SEL$2; "T1"@"SEL$2")
---------------------
QUERY BLOCK SIGNATURE
---------------------
  signature (): qb_name=SEL$FFBD8603 nbfros=1 flg=0
    fro(0): flg=0 objn=86873 hint_alias="T2"@"SEL$2"

SQL:******* UNPARSED QUERY IS *******
SELECT "T2"."ID" "T2_ID","T2"."N" "T2_N" FROM CHA."T2" "T2"
Query block SEL$FFBD8603 (#0) simplified

Case #2

As of Oracle Database 11g Release 2 join elimination covers another case. Its aim is to avoid the execution of “unnecessary” self-joins. The following SQL statements show an example. Notice that since the join is performed on the primary key (column ID) there is no need to access the table twice. In fact, it is possible to replace the references to the eliminated table (T2 in this example) in the SELECT clause with columns of the table that is not eliminated (T1).

SQL> EXPLAIN PLAN FOR SELECT t11.*, t12.* FROM t1 t11, t1 t12 WHERE t11.id = t12.id;

SQL> SELECT * FROM table(dbms_xplan.display(NULL,NULL,'basic'));

PLAN_TABLE_OUTPUT
----------------------------------
Plan hash value: 3617692013

----------------------------------
| Id  | Operation         | Name |
----------------------------------
|   0 | SELECT STATEMENT  |      |
|   1 |  TABLE ACCESS FULL| T1   |
----------------------------------

The relevant part of the output of event 10053 is the following (also in this case notice that there are two queries):

JE:   Considering Join Elimination on query block SEL$1 (#0)
*************************
Join Elimination (JE)
*************************
SQL:******* UNPARSED QUERY IS *******
SELECT "T11"."ID" "ID","T11"."N" "N","T11"."PAD" "PAD","T12"."ID" "ID","T12"."N" "N","T12"."PAD" "PAD" FROM "CHA"."T1" "T11","CHA"."T1" "T12" WHERE "T11"."ID"="T12"."ID"
JE:   cfro: T1 objn:86871 col#:1 dfro:T1 dcol#:1
JE:   cfro: T1 objn:86871 col#:1 dfro:T1 dcol#:1
JE:   cfro: T1 objn:86871 col#:1 dfro:T1 dcol#:1
JE:   cfro: T1 objn:86871 col#:1 dfro:T1 dcol#:1
Query block (0x2c14f098) before join elimination:
SQL:******* UNPARSED QUERY IS *******
SELECT "T11"."ID" "ID","T11"."N" "N","T11"."PAD" "PAD","T12"."ID" "ID","T12"."N" "N","T12"."PAD" "PAD" FROM "CHA"."T1" "T11","CHA"."T1" "T12" WHERE "T11"."ID"="T12"."ID"
JE:   eliminate table: T1 (T12)
JE:   Replaced column: T12.PAD with column: T11.PAD
JE:   Replaced column: T12.N with column: T11.N
JE:   Replaced column: T12.ID with column: T11.ID
Registered qb: SEL$DF69B110 0x2c14f098 (JOIN REMOVED FROM QUERY BLOCK SEL$1; SEL$1; "T12"@"SEL$1")
---------------------
QUERY BLOCK SIGNATURE
---------------------
  signature (): qb_name=SEL$DF69B110 nbfros=1 flg=0
    fro(0): flg=0 objn=86871 hint_alias="T11"@"SEL$1"

SQL:******* UNPARSED QUERY IS *******
SELECT "T11"."ID" "ID","T11"."N" "N","T11"."PAD" "PAD","T11"."ID" "ID","T11"."N" "N","T11"."PAD" "PAD" FROM "CHA"."T1" "T11"
Query block SEL$DF69B110 (#0) simplified

Note that running the previous example in Oracle Database 11g Release 1 or earlier leads, as expected, to a join like the following one.

SQL> EXPLAIN PLAN FOR SELECT t11.*, t12.* FROM t1 t11, t1 t12 WHERE t11.id = t12.id;

SQL> SELECT * FROM table(dbms_xplan.display(NULL,NULL,'basic'));

PLAN_TABLE_OUTPUT
----------------------------------------------
Plan hash value: 774821007

----------------------------------------------
| Id  | Operation                    | Name  |
----------------------------------------------
|   0 | SELECT STATEMENT             |       |
|   1 |  NESTED LOOPS                |       |
|   2 |   NESTED LOOPS               |       |
|   3 |    TABLE ACCESS FULL         | T1    |
|   4 |    INDEX UNIQUE SCAN         | T1_PK |
|   5 |   TABLE ACCESS BY INDEX ROWID| T1    |
----------------------------------------------


Dec 17 2009

Does CREATE INDEX Gather Global Statistics?

Tag: 10gR1, 10gR2, 11gR1, 11gR2, 9iR2, Indexes, Object Statistics, PartitioningChristian Antognini @ 9:54 am

You can add the COMPUTE STATISTICS clause to the CREATE INDEX statement. It instructs the SQL statement to gather and store index statistics in the data dictionary, while creating the index. This is useful because the overhead associated with the gathering of statistics while executing this SQL statement is negligible. In Oracle9i, the gathering of statistics is performed only when this clause is specified. As of Oracle Database 10g, whenever statistics are not locked, their gathering is done by default, which means the COMPUTE STATISTICS clause is deprecated and available for backward compatibility only.

Unfortunately, CREATE INDEX does not gather global statistics. As a result, whenever you are creating partitioned indexes, the global statistics might be inaccurate. Let me show you an example:

  • Create partitioned table, insert data (notice that the number of distinct values is equal to the number of rows) and create a local index

SQL> CREATE TABLE t (n1 number, n2 number)
  2  PARTITION BY RANGE (n1) (
  3    PARTITION p1 VALUES LESS THAN (11),
  4    PARTITION p2 VALUES LESS THAN (21)
  5  );

Table created.

SQL> INSERT INTO t
  2  SELECT rownum, rownum
  3  FROM dual
  4  CONNECT BY level <= 20;

20 rows created.

SQL> CREATE INDEX i ON t (n2) LOCAL;

Index created.

  • The CREATE INDEX statement gathered the statistics for the index; let’s check them…

SQL> SELECT partition_name, global_stats, distinct_keys
  2  FROM user_ind_statistics
  3  WHERE index_name = 'I';

PARTITION_NAME GLOBAL_STATS DISTINCT_KEYS
-------------- ------------ -------------
               NO                      10
P1             NO                      10
P2             NO                      10

As you can see 1) the number of distinct keys at the global level is wrong; it should be 20! 2) the GLOBAL_STATS column at the index level is set to NO. As a result, when you create a partitioned index, you should manually gather the global index statistics straight after. In other words, you should do the following:

  • Manually gather global level index statistics

SQL> execute dbms_stats.gather_index_stats(ownname=>user, indname=>'i', granularity=>'global')

PL/SQL procedure successfully completed.

  • Check whether the index statistics are accurate

SQL> SELECT partition_name, global_stats, distinct_keys
  2  FROM user_ind_statistics
  3  WHERE index_name = 'I';

PARTITION_NAME GLOBAL_STATS DISTINCT_KEYS
-------------- ------------ -------------
               YES                     20
P1             NO                      10
P2             NO                      10

There are situations, however, where it is not necessary to manually gather the global index statistics. For example, when the index is prefixed. But, as a general rule, I would not rely on the automatically gathered statistics for partitioned indexes.


Oct 23 2009

Hints for Direct-path Insert Statements

Tag: 10gR1, 10gR2, 11gR1, 11gR2, 9iR2, Direct PathChristian Antognini @ 5:28 pm

Up to Oracle Database 10g Release 2, direct-path inserts are supported only by INSERT INTO … SELECT … statements (including multitable inserts), MERGE statements (for the part inserting data), and applications using the OCI direct-path interface (for example, the SQL*Loader utility). At the statement level two methods are available to specify that a direct-path insert has to be used:

  • Specify the APPEND hint in the SQL statement
  • Execute the SQL statement (actually, at least the INSERT part) in parallel

Let’s have a look to an example. Notice that:

  • The APPEND hint is used to execute a direct-path insert.
  • The APPEND hint does not work with a “regular” INSERT statement that uses the VALUES clause.
  • To check whether the direct-path insert is performed, the modified table is queried without committing (or rolling back) the transaction. As a result, after a direct-path insert the database engine raises an ORA-12838.

SQL> SELECT * FROM v$version WHERE rownum = 1;

BANNER
----------------------------------------------------------------
Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bi

SQL> CREATE TABLE t (n NUMBER);

SQL> INSERT /*+ append */ INTO t SELECT 1 FROM dual;

SQL> SELECT * FROM t;
SELECT * FROM t
              *
ERROR at line 1:
ORA-12838: cannot read/modify an object after modifying it in parallel

SQL> COMMIT;

SQL> INSERT /*+ append */ INTO t VALUES (2);

SQL> SELECT * FROM t;

         N
----------
         1
         2

Strangely enough, at least for me, in Oracle Database 11g Release 1 the behavior of the APPEND hint has changed. In fact, it is accepted also for a “regular” INSERT statement that uses the VALUES clause. Let’s run the same test as before to illustrate the new behavior.

SQL> SELECT * FROM v$version WHERE rownum = 1;

BANNER
--------------------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.1.0.6.0 - 64bit Production

SQL> CREATE TABLE t (n NUMBER);

SQL> INSERT /*+ append */ INTO t SELECT 1 FROM dual;

SQL> SELECT * FROM t;
SELECT * FROM t
              *
ERROR at line 1:
ORA-12838: cannot read/modify an object after modifying it in parallel

SQL> COMMIT;

SQL> INSERT /*+ append */ INTO t VALUES (1);

SQL> SELECT * FROM t;
SELECT * FROM t
              *
ERROR at line 1:
ORA-12838: cannot read/modify an object after modifying it in parallel

Even more strange, in Oracle Database 11g Release 2 the behavior of the APPEND hint was reverted to the pre-11g one! But, since the feature is really useful in some situations, a new hint called APPEND_VALUES is available. The following example illustrates the new behavior.

SQL> SELECT * FROM v$version WHERE rownum = 1;

BANNER
--------------------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - 64bit Production

SQL> CREATE TABLE t (n NUMBER);

SQL> INSERT /*+ append */ INTO t SELECT 1 FROM dual;

SQL> SELECT * FROM t;
SELECT * FROM t
              *
ERROR at line 1:
ORA-12838: cannot read/modify an object after modifying it in parallel

SQL> COMMIT;

SQL> INSERT /*+ append */ INTO t VALUES (2);

SQL> SELECT * FROM t;

         N
----------
         1
         2

SQL> COMMIT;

SQL> INSERT /*+ append_values */ INTO t VALUES (3);

SQL> SELECT * FROM t;
SELECT * FROM t
*
ERROR at line 1:
ORA-12838: cannot read/modify an object after modifying it in parallel


« Previous PageNext Page »