Dec 17 2009

Does CREATE INDEX Gather Global Statistics?

Tag: 10gR1, 10gR2, 11gR1, 11gR2, 9iR2, Indexes, Object Statistics, PartitioningChristian Antognini @ 9:54 am

You can add the COMPUTE STATISTICS clause to the CREATE INDEX statement. It instructs the SQL statement to gather and store index statistics in the data dictionary, while creating the index. This is useful because the overhead associated with the gathering of statistics while executing this SQL statement is negligible. In Oracle9i, the gathering of statistics is performed only when this clause is specified. As of Oracle Database 10g, whenever statistics are not locked, their gathering is done by default, which means the COMPUTE STATISTICS clause is deprecated and available for backward compatibility only.

Unfortunately, CREATE INDEX does not gather global statistics. As a result, whenever you are creating partitioned indexes, the global statistics might be inaccurate. Let me show you an example:

  • Create partitioned table, insert data (notice that the number of distinct values is equal to the number of rows) and create a local index

SQL> CREATE TABLE t (n1 number, n2 number)
  2  PARTITION BY RANGE (n1) (
  3    PARTITION p1 VALUES LESS THAN (11),
  4    PARTITION p2 VALUES LESS THAN (21)
  5  );

Table created.

SQL> INSERT INTO t
  2  SELECT rownum, rownum
  3  FROM dual
  4  CONNECT BY level <= 20;

20 rows created.

SQL> CREATE INDEX i ON t (n2) LOCAL;

Index created.

  • The CREATE INDEX statement gathered the statistics for the index; let’s check them…

SQL> SELECT partition_name, global_stats, distinct_keys
  2  FROM user_ind_statistics
  3  WHERE index_name = 'I';

PARTITION_NAME GLOBAL_STATS DISTINCT_KEYS
-------------- ------------ -------------
               NO                      10
P1             NO                      10
P2             NO                      10

As you can see 1) the number of distinct keys at the global level is wrong; it should be 20! 2) the GLOBAL_STATS column at the index level is set to NO. As a result, when you create a partitioned index, you should manually gather the global index statistics straight after. In other words, you should do the following:

  • Manually gather global level index statistics

SQL> execute dbms_stats.gather_index_stats(ownname=>user, indname=>'i', granularity=>'global')

PL/SQL procedure successfully completed.

  • Check whether the index statistics are accurate

SQL> SELECT partition_name, global_stats, distinct_keys
  2  FROM user_ind_statistics
  3  WHERE index_name = 'I';

PARTITION_NAME GLOBAL_STATS DISTINCT_KEYS
-------------- ------------ -------------
               YES                     20
P1             NO                      10
P2             NO                      10

There are situations, however, where it is not necessary to manually gather the global index statistics. For example, when the index is prefixed. But, as a general rule, I would not rely on the automatically gathered statistics for partitioned indexes.


Oct 23 2009

Hints for Direct-path Insert Statements

Tag: 10gR1, 10gR2, 11gR1, 11gR2, 9iR2, Direct PathChristian Antognini @ 5:28 pm

Up to Oracle Database 10g Release 2, direct-path inserts are supported only by INSERT INTO … SELECT … statements (including multitable inserts), MERGE statements (for the part inserting data), and applications using the OCI direct-path interface (for example, the SQL*Loader utility). At the statement level two methods are available to specify that a direct-path insert has to be used:

  • Specify the APPEND hint in the SQL statement
  • Execute the SQL statement (actually, at least the INSERT part) in parallel

Let’s have a look to an example. Notice that:

  • The APPEND hint is used to execute a direct-path insert.
  • The APPEND hint does not work with a “regular” INSERT statement that uses the VALUES clause.
  • To check whether the direct-path insert is performed, the modified table is queried without committing (or rolling back) the transaction. As a result, after a direct-path insert the database engine raises an ORA-12838.

SQL> SELECT * FROM v$version WHERE rownum = 1;

BANNER
----------------------------------------------------------------
Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bi

SQL> CREATE TABLE t (n NUMBER);

SQL> INSERT /*+ append */ INTO t SELECT 1 FROM dual;

SQL> SELECT * FROM t;
SELECT * FROM t
              *
ERROR at line 1:
ORA-12838: cannot read/modify an object after modifying it in parallel

SQL> COMMIT;

SQL> INSERT /*+ append */ INTO t VALUES (2);

SQL> SELECT * FROM t;

         N
----------
         1
         2

Strangely enough, at least for me, in Oracle Database 11g Release 1 the behavior of the APPEND hint has changed. In fact, it is accepted also for a “regular” INSERT statement that uses the VALUES clause. Let’s run the same test as before to illustrate the new behavior.

SQL> SELECT * FROM v$version WHERE rownum = 1;

BANNER
--------------------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.1.0.6.0 - 64bit Production

SQL> CREATE TABLE t (n NUMBER);

SQL> INSERT /*+ append */ INTO t SELECT 1 FROM dual;

SQL> SELECT * FROM t;
SELECT * FROM t
              *
ERROR at line 1:
ORA-12838: cannot read/modify an object after modifying it in parallel

SQL> COMMIT;

SQL> INSERT /*+ append */ INTO t VALUES (1);

SQL> SELECT * FROM t;
SELECT * FROM t
              *
ERROR at line 1:
ORA-12838: cannot read/modify an object after modifying it in parallel

Even more strange, in Oracle Database 11g Release 2 the behavior of the APPEND hint was reverted to the pre-11g one! But, since the feature is really useful in some situations, a new hint called APPEND_VALUES is available. The following example illustrates the new behavior.

SQL> SELECT * FROM v$version WHERE rownum = 1;

BANNER
--------------------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - 64bit Production

SQL> CREATE TABLE t (n NUMBER);

SQL> INSERT /*+ append */ INTO t SELECT 1 FROM dual;

SQL> SELECT * FROM t;
SELECT * FROM t
              *
ERROR at line 1:
ORA-12838: cannot read/modify an object after modifying it in parallel

SQL> COMMIT;

SQL> INSERT /*+ append */ INTO t VALUES (2);

SQL> SELECT * FROM t;

         N
----------
         1
         2

SQL> COMMIT;

SQL> INSERT /*+ append_values */ INTO t VALUES (3);

SQL> SELECT * FROM t;
SELECT * FROM t
*
ERROR at line 1:
ORA-12838: cannot read/modify an object after modifying it in parallel


Aug 03 2009

A-Rows and DML Statements – Part 2

Tag: 10gR1, 10gR2, 11gR1, 9iR2Christian Antognini @ 1:52 pm

In the first post about this topic I wrote: “What I don’t like about the column “A-Rows” (or the underlying columns LAST_OUTPUT_ROWS in the V$ views), is that for the operations modifying a table 0 is shown. By the way, according to the documentation it is not a bug.”

What I forgot to mention is that depending on how you interpret the documentation, another behavior might also be seen as buggy. In fact, in the documentation you can read the following description for the column LAST_OUTPUT_ROWS of the view V$SQL_PLAN_STATISTICS: “Number of rows produced by the row source, during the last execution”. The question is: what does “produced” in this context mean?

According to the interpretation given by Oracle, for “top-level” operations (e.g. SELECT, UPDATE, DELETE, …) it’s the number of rows retrieved through a SELECT statement.

Another interpretation could be the number of rows returned through any SQL statement. And that, independently on how the rows are produced. For example, because of the RETURNING clause, the UPDATE statement in the following PL/SQL block actually returns 14 rows to the PL/SQL engine. But, as you can see, also in this case 0 is shown.

SQL> DECLARE
  2    TYPE t_emp IS TABLE OF scott.emp%ROWTYPE;
  3    l_emp t_emp;
  4  BEGIN
  5    UPDATE /*+ gather_plan_statistics */ scott.emp SET sal = sal * 1.15
  6    RETURNING empno, ename, job, mgr, hiredate, sal, comm, deptno
  7    BULK COLLECT INTO l_emp;
  8  END;
  9  /

SQL> SELECT sql_id
  2  FROM v$sqlarea
  3  WHERE sql_text LIKE 'UPDATE%RETURNING%';

SQL_ID
-------------
02x15ba008dt9

SQL> SELECT *
  2  FROM table(dbms_xplan.display_cursor(sql_id=>'02x15ba008dt9', format=>'iostats last'));

SQL_ID  02x15ba008dt9, child number 0
-------------------------------------
UPDATE /*+ gather_plan_statistics */ SCOTT.EMP SET SAL = SAL * 1.15
RETURNING EMPNO, ENAME, JOB, MGR, HIREDATE, SAL, COMM, DEPTNO INTO :O0
,:O1 ,:O2 ,:O3 ,:O4 ,:O5 ,:O6 ,:O7

Plan hash value: 1494045816

-------------------------------------------------------------------------------------
| Id  | Operation          | Name | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
-------------------------------------------------------------------------------------
|   0 | UPDATE STATEMENT   |      |      1 |        |      0 |00:00:00.01 |      23 |
|   1 |  UPDATE            | EMP  |      1 |        |      0 |00:00:00.01 |      23 |
|   2 |   TABLE ACCESS FULL| EMP  |      1 |     14 |     14 |00:00:00.01 |       7 |
-------------------------------------------------------------------------------------

Either these things that are returned cannot be called rows, or the current implementation is not very consistent. I already shared your opinion with you… I don’t like the current situation.


Jul 31 2009

A-Rows and DML Statements

Tag: 10gR1, 10gR2, 11gR1, 9iR2, Bug, TOPChristian Antognini @ 11:15 am

Today’s post is dedicated to the Metalink SR identified by the number 6468252.994. I know, this number says nothing to you. For me, however, it’s a very well known number. The reason is quite simple… Even if I open this SR almost two years ago (to be precise, September 5, 2007), it was closed few days ago. By far the most long-lasting SR I even experienced.

Let me explain why I opened it.

When assessing execution plans I like to use DBMS_XPLAN or, when necessary, to directly look at views like V$SQL_PLAN_STATISTICS and V$SQL_PLAN_STATISTICS_ALL. I like them because they provide a lot of information about what’s going on. In other words, they help me avoiding guesswork as much as possible. One of the most interesting information they provide is the number of rows returned by a given operation. DBMS_XPLAN provides this information in the column “A-Rows”. Be careful to not confuse it with the columns “Rows” and “E-Rows”. While “A-Rows” shows the actual number of rows, the other two shows the estimated number of rows. So far, so good.

What I don’t like about the column “A-Rows” (or the underlying columns LAST_OUTPUT_ROWS in the V$ views), is that for the operations modifying a table 0 is shown. By the way, according to the documentation it is not a bug. In my book I point out this behavior at page 233. For example, as the following SQL statements show, even if 14 rows are modified the value of “A-Rows” for the operation UPDATE (Id=1) is 0:

SQL> UPDATE /*+ gather_plan_statistics */ scott.emp SET sal = sal * 1.15;

14 rows updated.

SQL> SELECT * FROM table(dbms_xplan.display_cursor(format=>'iostats last'));

SQL_ID  4cs72g2hp6j67, child number 0
-------------------------------------
UPDATE /*+ gather_plan_statistics */ scott.emp SET sal = sal * 1.15

Plan hash value: 1494045816

-------------------------------------------------------------------------------------
| Id  | Operation          | Name | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
-------------------------------------------------------------------------------------
|   0 | UPDATE STATEMENT   |      |      1 |        |      0 |00:00:00.01 |      25 |
|   1 |  UPDATE            | EMP  |      1 |        |      0 |00:00:00.01 |      25 |
|   2 |   TABLE ACCESS FULL| EMP  |      1 |     14 |     14 |00:00:00.01 |       7 |
-------------------------------------------------------------------------------------

Back to the SR. What I tried to do with the SR is to convince Oracle that it would be much better to have the information about the number of modified rows. Unfortunately, they confirmed that the current behavior is the best one. My guess is that doing such a modification is not trivial and, therefore, they decided not doing it. In fact, also the SQL trace files have a similar behavior. For example, via SQL trace, the execution plan for the UPDATE statement shown before is the following:

Rows     Row Source Operation
-------  ---------------------------------------------------
      0  UPDATE  EMP (cr=7 pr=0 pw=0 time=0 us)
     14   TABLE ACCESS FULL EMP (cr=7 pr=0 pw=0 time=0 us cost=3 size=56 card=14)

It is also interesting to note that while analyzing the code (probably for checking whether the change I proposed was doable) a developer discovered a bug. Based on the information I received (see bug number 6410147) it seems that in some situation the value of “A-Rows” is different than 0. But, since it is a bug, they fixed it.

Update 2009-08-03: part 2 of this post is available here.


May 05 2009

Wrong Information about Temporary Space Usage in V$SQL_PLAN_STATISTICS_ALL and DBMS_XPLAN Output

Tag: 10gR1, 10gR2, 11gR1, Bug, TOPChristian Antognini @ 1:00 am

As you can read in the documentation, the columns MAX_TEMPSEG_SIZE and LAST_TEMPSEG_SIZE in the dynamic performance view V$SQL_WORKAREA provide information about the size of the temporary segment used for a specific workarea. The values are given in bytes. Let’s perform a test to check this information…

  • Create a test table that contains about 1MB of data:

SQL> CREATE TABLE t AS
  2  SELECT rownum AS id, dbms_random.string('p',1000) AS pad
  3  FROM dual
  4  CONNECT BY level <= 1000;

SQL> execute dbms_stats.gather_table_stats(user, 't')

  • Setup the session to force the user process to spill into a temporary segment:

SQL> ALTER SESSION SET workarea_size_policy = manual;
SQL> ALTER SESSION SET sort_area_size = 524288;

  • Run test query including a sort operation (that spills to the temporary tablespace):

SQL> SELECT id FROM t ORDER BY pad;

  • Check the amount of used temporary space by querying V$SQL_WORKAREA:

SQL> SELECT max_tempseg_size, last_tempseg_size
  2  FROM v$sql_workarea
  3  WHERE (sql_id, child_number) IN (SELECT prev_sql_id, prev_child_number
  4                                   FROM v$session
  5                                   WHERE sid = sys_context('userenv','sid'));

MAX_TEMPSEG_SIZE LAST_TEMPSEG_SIZE
---------------- -----------------
         2097152           2097152

According to this information the size of the temporary space used to execute the query was 2MB. So far, so good.

Always according to the documentation another dynamic performance view, V$SQL_PLAN_STATISTICS_ALL, should provide the same information (remember, V$SQL_PLAN_STATISTICS_ALL shows in a single view all the information provided by the views V$SQL_PLAN, V$SQL_PLAN_STATISTICS, and V$SQL_WORKAREA). Let’s check it…

  • Run the same test query as before:

SQL> SELECT id FROM t ORDER BY pad;

  • Check the amount of used memory by querying V$SQL_PLAN_STATISTICS_ALL:

SQL> SELECT max_tempseg_size, last_tempseg_size
  2  FROM v$sql_plan_statistics_all
  3  WHERE (sql_id, child_number) IN (SELECT prev_sql_id, prev_child_number
  4                                   FROM v$session
  5                                   WHERE sid = sys_context('userenv','sid'))
  6  AND max_tempseg_size IS NOT NULL;

MAX_TEMPSEG_SIZE LAST_TEMPSEG_SIZE
---------------- -----------------
            2048              2048

Ups! According to this information the size of the temporary space used to execute the query was 2KB. Mhmm, something is not good… For this reason, at the end of 2007 I opened a service request about this issue. The support guy recognized the problem and opened a bug. Fine. For some unknown reasons (?) yesterday I was checking the status of few bugs. While doing so I noticed that this specific bug was closed few months ago with the status “Could Not Reproduce”! I don’t know you, but on my 64-bit Linux server I can reproduce it with at least 11.1.0.7.0, 11.1.0.6.0, 10.2.0.4.0, 10.2.0.3.0, 10.2.0.2.0, 10.2.0.1.0, 10.1.0.5.0, 10.1.0.4.0 and 10.1.0.3.0. Geez!

It is essential to note that also the package DBMS_XPLAN shows wrong information (here an example for the same query as before):

SQL> SELECT * FROM table(dbms_xplan.display_cursor(null,null,'memstats last'));

PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------------
SQL_ID  ftb71b6926dtn, child number 0
-------------------------------------
SELECT id FROM t ORDER BY pad

Plan hash value: 961378228

---------------------------------------------------------------------------------
| Id  | Operation          | Name | E-Rows |  OMem |  1Mem | Used-Mem | Used-Tmp|
---------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |        |       |       |          |         |
|   1 |  SORT ORDER BY     |      |   1000 |  1152K|   562K|  529K (1)|    2048 |
|   2 |   TABLE ACCESS FULL| T    |   1000 |       |       |          |         |
---------------------------------------------------------------------------------

The only good thing about the fact that Oracle is not willing to fix the bug is that my book, Troubleshooting Oracle Performance, does not need to be updated. In fact, at page 210, while describing the output of the package DBMS_XPLAN I wrote the following information:

  • Used-Tmp: The amount of temporary space used by the operation during the last execution. This value must be multiplied by 1,024 to be consistent with the other memory utilization columns (for example, 32K means 32MB).
  • Max-Tmp: The maximum amount of temporary space used by the operation. This value has to be multiplied by 1,024 to be consistent with the other memory utilization columns (for example, 32K means 32MB).

ADDENDA (Mai 6, 2009): This post was noticed by an Oracle employee and, as a result, the bug was reopened. Thank you Greg!


« Previous PageNext Page »