Aug 04 2010

Partially Index a Table

Tag: 11gR2, Indexes, Partitioning
Christian Antognini @ 8:31 am

Recently the following question was posted on oracle-l (I paraphrase…):

With Oracle Database, is it possible to create something similar to Teradata’s sparse indexes?

Since the question is an interesting one, I decided to write this short post.

First of all, I have to say that the CREATE INDEX statement of Oracle Database does not support such a feature. What a pity! I would really like to see this feature in Oracle Database 12x (I bet that “x” will be “f”; post your opinion as a comment if you want). Several other database engines support it: Teradata was mentioned by the OP; two additional examples are PostgreSQL and SQL Server.

How can you do something similar with Oracle Database?

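As of Oracle Database 11g Release 2 it is possible to partially index a table by taking advantage of Zero-Size Unusable Indexes: an unusable index, or index partition, no longer occupies a segment. The following SQL statements show an example where only the data of August 2010 is indexed. The index is created with all its partitions unusable, and then only the partition for August is rebuilt.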

SQL> CREATE TABLE t AS
  2  SELECT rownum AS id, sysdate-mod(rownum,100) AS tim, rpad('*',50,'*') AS pad
  3  FROM dual
  4  CONNECT BY level <= 1000;

SQL> CREATE INDEX i ON t (tim)
  2  GLOBAL PARTITION BY RANGE (tim) (
  3    PARTITION i_201001 VALUES LESS THAN (to_date('2010-02-01','YYYY-MM-DD')),
  4    PARTITION i_201002 VALUES LESS THAN (to_date('2010-03-01','YYYY-MM-DD')),
  5    PARTITION i_201003 VALUES LESS THAN (to_date('2010-04-01','YYYY-MM-DD')),
  6    PARTITION i_201004 VALUES LESS THAN (to_date('2010-05-01','YYYY-MM-DD')),
  7    PARTITION i_201005 VALUES LESS THAN (to_date('2010-06-01','YYYY-MM-DD')),
  8    PARTITION i_201006 VALUES LESS THAN (to_date('2010-07-01','YYYY-MM-DD')),
  9    PARTITION i_201007 VALUES LESS THAN (to_date('2010-08-01','YYYY-MM-DD')),
 10    PARTITION i_201008 VALUES LESS THAN (to_date('2010-09-01','YYYY-MM-DD')),
 11    PARTITION i_201009 VALUES LESS THAN (to_date('2010-10-01','YYYY-MM-DD')),
 12    PARTITION i_201010 VALUES LESS THAN (to_date('2010-11-01','YYYY-MM-DD')),
 13    PARTITION i_201011 VALUES LESS THAN (to_date('2010-12-01','YYYY-MM-DD')),
 14    PARTITION i_201012 VALUES LESS THAN (to_date('2011-01-01','YYYY-MM-DD')),
 15    PARTITION i_maxvalue VALUES LESS THAN (MAXVALUE)
 16  )
 17  UNUSABLE;

SQL> ALTER INDEX i REBUILD PARTITION i_201008;
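
To check the result, you can query the data dictionary. The following is a minimal sketch, assuming that the index I created above is owned by the current user; only the partition I_201008 should be reported as USABLE.

```sql
-- Status of every partition of index I
-- (expected: UNUSABLE for all partitions except I_201008)
SELECT partition_name, status
FROM user_ind_partitions
WHERE index_name = 'I'
ORDER BY partition_position;
```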

It goes without saying that you are not forced to have so many partitions in place. In fact, to index only the data of August, the following CREATE INDEX statement is more appropriate.

SQL> CREATE INDEX i ON t (tim)
  2  GLOBAL PARTITION BY RANGE (tim) (
  3    PARTITION i_201007 VALUES LESS THAN (to_date('2010-08-01','YYYY-MM-DD')),
  4    PARTITION i_201008 VALUES LESS THAN (to_date('2010-09-01','YYYY-MM-DD')),
  5    PARTITION i_maxvalue VALUES LESS THAN (MAXVALUE)
  6  )
  7  UNUSABLE;

Then, to index the data of September instead, you have to execute SQL statements like the following ones.

  • Create a partition for the data of September and rebuild it:

SQL> ALTER INDEX i SPLIT PARTITION i_maxvalue AT (to_date('2010-10-01','YYYY-MM-DD')) INTO (
  2    PARTITION i_201009,
  3    PARTITION i_maxvalue
  4  );

SQL> ALTER INDEX i REBUILD PARTITION i_201009;

  • Drop the oldest partition:

SQL> ALTER INDEX i DROP PARTITION i_201007;

  • Make the partition for the data of August unusable:

SQL> ALTER INDEX i MODIFY PARTITION i_201008 UNUSABLE;

In this way the index has at most three partitions. And, most of the time, only one of them is usable and, therefore, occupies space.
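
To see which partitions actually occupy space, one option is to look at the segments of the index. This is only a sketch, assuming that the index I is owned by the current user; since unusable partitions have no segment, they are expected to be missing from the output.

```sql
-- Segments allocated for index I; unusable partitions should not appear
SELECT partition_name, bytes
FROM user_segments
WHERE segment_name = 'I'
ORDER BY partition_name;
```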

Even though the example in this post uses a global index, you can use the same technique with local indexes as well. That said, I see no problem in using a global index like the one shown in this post.
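
As a rough sketch of the local variant, assume that the table t is range-partitioned on tim and that its partition for August 2010 is named p_201008. Both are assumptions made for illustration (the table created above is not partitioned). The partitions of a local index inherit the names of the table partitions unless they are named explicitly.

```sql
-- Create the local index with all its partitions unusable
CREATE INDEX i ON t (tim) LOCAL UNUSABLE;

-- Rebuild only the partition that has to be indexed
-- (p_201008 is a hypothetical partition name)
ALTER INDEX i REBUILD PARTITION p_201008;
```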


Aug 02 2010

Partition-Wise Join of List-Partitioned Tables

Tag: 10gR1, 10gR2, 11gR1, 11gR2, Partitioning, Query Optimizer
Christian Antognini @ 10:59 am

When two tables are equi-partitioned on their join keys, the query optimizer is able to take advantage of partition-wise joins. To make sure that the tables are equi-partitioned, as of Oracle Database 11g reference partitioning can be used. In fact, by definition, with reference partitioning all “related” tables have exactly the same partitioning scheme. If you are not using reference partitioning, you must be very careful that the tables are effectively partitioned in the very same way. For range- and hash-partitioned tables this is usually not a problem. However, when using list partitioning, it is quite easy to make a mistake. The reason is that the partitions can be defined in any order. Let’s have a look at an example based on the following two tables. Notice that in the second one the partitions p_5 and p_4 are defined in reverse order.

SQL> CREATE TABLE t1p
  2  PARTITION BY LIST (pkey) (
  3    PARTITION p_0 VALUES (0),
  4    PARTITION p_1 VALUES (1),
  5    PARTITION p_2 VALUES (2),
  6    PARTITION p_3 VALUES (3),
  7    PARTITION p_4 VALUES (4),
  8    PARTITION p_5 VALUES (5),
  9    PARTITION p_6 VALUES (6),
 10    PARTITION p_7 VALUES (7),
 11    PARTITION p_8 VALUES (8),
 12    PARTITION p_9 VALUES (9)
 13  )
 14  AS
 15  SELECT rownum AS num, mod(rownum,10) AS pkey, dbms_random.string('p',50) AS pad
 16  FROM dual
 17  CONNECT BY level <= 10000;

SQL> CREATE TABLE t2p
  2  PARTITION BY LIST (pkey) (
  3    PARTITION p_0 VALUES (0),
  4    PARTITION p_1 VALUES (1),
  5    PARTITION p_2 VALUES (2),
  6    PARTITION p_3 VALUES (3),
  7    PARTITION p_5 VALUES (5),
  8    PARTITION p_4 VALUES (4),
  9    PARTITION p_6 VALUES (6),
 10    PARTITION p_7 VALUES (7),
 11    PARTITION p_8 VALUES (8),
 12    PARTITION p_9 VALUES (9)
 13  )
 14  AS
 15  SELECT rownum AS num, mod(rownum,10) AS pkey, dbms_random.string('p',50) AS pad
 16  FROM dual
 17  CONNECT BY level <= 10000;

SQL> BEGIN
  2    dbms_stats.gather_table_stats(user,'t1p');
  3    dbms_stats.gather_table_stats(user,'t2p');
  4  END;
  5  /

Even though the two tables are logically equivalent, partition-wise joins cannot be used with them, as the following execution plan shows.

SQL> EXPLAIN PLAN FOR SELECT * FROM t1p JOIN t2p USING (num, pkey);

SQL> SELECT * FROM table(dbms_xplan.display(format=>'basic'));

PLAN_TABLE_OUTPUT
------------------------------------

Plan hash value: 3059592055

------------------------------------
| Id  | Operation           | Name |
------------------------------------
|   0 | SELECT STATEMENT    |      |
|   1 |  HASH JOIN          |      |
|   2 |   PARTITION LIST ALL|      |
|   3 |    TABLE ACCESS FULL| T1P  |
|   4 |   PARTITION LIST ALL|      |
|   5 |    TABLE ACCESS FULL| T2P  |
------------------------------------

The difference in the order of the partitions can also be confirmed by a query like the following one.

SQL> SELECT t1p.high_value,
  2         t1p.partition_position AS pos_t1p,
  3         t2p.partition_position AS pos_t2p,
  4         decode(t1p.partition_position, t2p.partition_position, 'Y', 'N') AS equal
  5  FROM user_tab_partitions t1p JOIN user_tab_partitions t2p ON t1p.partition_name = t2p.partition_name
  6  WHERE t1p.table_name = 'T1P'
  7  AND t2p.table_name = 'T2P';

HIGH_VALUE   POS_T1P  POS_T2P EQUAL
----------- -------- -------- ------
0                  1        1 Y
1                  2        2 Y
2                  3        3 Y
3                  4        4 Y
5                  6        5 N
4                  5        6 N
6                  7        7 Y
7                  8        8 Y
8                  9        9 Y
9                 10       10 Y

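It goes without saying that, to solve the problem, the partitions must be reordered. To do so, it is enough to move the out-of-order partitions. To avoid storing the data twice, a series of ALTER TABLE EXCHANGE/DROP/ADD/EXCHANGE statements can be used: the data of the partition is swapped into an empty table, the now-empty partition is dropped and added again (which puts it at the end), and the data is swapped back.

  • Move the P_5 partition of the T1P table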

SQL> CREATE TABLE t1p_5 AS
  2  SELECT *
  3  FROM t1p PARTITION (p_5)
  4  WHERE 1 = 0;

SQL> ALTER TABLE t1p EXCHANGE PARTITION p_5 WITH TABLE t1p_5;

SQL> ALTER TABLE t1p DROP PARTITION p_5;

SQL> ALTER TABLE t1p ADD PARTITION p_5 VALUES (5);

SQL> ALTER TABLE t1p EXCHANGE PARTITION p_5 WITH TABLE t1p_5;

SQL> DROP TABLE t1p_5 PURGE;

  • Move the P_5 partition of the T2P table

SQL> CREATE TABLE t2p_5 AS
  2  SELECT *
  3  FROM t2p PARTITION (p_5)
  4  WHERE 1 = 0;

SQL> ALTER TABLE t2p EXCHANGE PARTITION p_5 WITH TABLE t2p_5;

SQL> ALTER TABLE t2p DROP PARTITION p_5;

SQL> ALTER TABLE t2p ADD PARTITION p_5 VALUES (5);

SQL> ALTER TABLE t2p EXCHANGE PARTITION p_5 WITH TABLE t2p_5;

SQL> DROP TABLE t2p_5 PURGE;

  • Check whether the order is ok

SQL> SELECT t1p.high_value,
  2         t1p.partition_position AS pos_t1p,
  3         t2p.partition_position AS pos_t2p,
  4         decode(t1p.partition_position, t2p.partition_position, 'Y', 'N') AS equal
  5  FROM user_tab_partitions t1p JOIN user_tab_partitions t2p ON t1p.partition_name = t2p.partition_name
  6  WHERE t1p.table_name = 'T1P'
  7  AND t2p.table_name = 'T2P';

HIGH_VALUE   POS_T1P  POS_T2P EQUAL
----------- -------- -------- ------
0                  1        1 Y
1                  2        2 Y
2                  3        3 Y
3                  4        4 Y
4                  5        5 Y
6                  6        6 Y
7                  7        7 Y
8                  8        8 Y
9                  9        9 Y
5                 10       10 Y

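After these operations partition-wise joins are possible. Executing EXPLAIN PLAN again for the same query confirms this: the hash join is now performed below the PARTITION LIST ALL operation, that is, partition by partition.

SQL> EXPLAIN PLAN FOR SELECT * FROM t1p JOIN t2p USING (num, pkey);

SQL> SELECT * FROM table(dbms_xplan.display(format=>'basic'));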

PLAN_TABLE_OUTPUT
------------------------------------

Plan hash value: 1324269388

------------------------------------
| Id  | Operation           | Name |
------------------------------------
|   0 | SELECT STATEMENT    |      |
|   1 |  PARTITION LIST ALL |      |
|   2 |   HASH JOIN         |      |
|   3 |    TABLE ACCESS FULL| T1P  |
|   4 |    TABLE ACCESS FULL| T2P  |
------------------------------------


Jul 13 2010

Oracle OpenWorld Schedule

Tag: Speaking, TOP
Christian Antognini @ 9:33 am

Back from two weeks of vacation, I noticed that the schedule of the next OpenWorld is available here.

The detailed information about my session, which is based on chapter 10 of my book, is the following:

ID#: S316683
Title: Join Techniques
Abstract: This presentation explains how the query optimizer joins multiple sets of data to each other. First it explains the operation of the basic join methods (nested loop, hash join, and merge join) and the possibilities we have to influence their performance. Then it presents some more advanced optimization techniques such as the transformations applied to joins, and partition-wise joins.
Track: Database
Date: 22-SEP-2010
Time: 13:00 – 14:00
Venue: Moscone South
Room: Rm 304

I’m looking forward to seeing you in San Francisco!

Update 2010-09-01: location was changed from “Rm 200” to “Rm 304”.


Jun 24 2010

Troubleshooting Oracle Performance – Downloadable Files

Tag: TOP
Christian Antognini @ 10:29 am

This is just a short note to point out that I just uploaded a new version of the scripts related to TOP. The new ZIP is available through this page.

The change log is the following:

connect.sql: Added DBM10205, DBA10205, DBM11201 and DBA11201
chapter02\bind_variables.sql: Because of 11g modified/added queries against V$SQL_SHARED_CURSOR
chapter02\sharable_cursors.sql: Added SET SERVEROUTPUT OFF in the initialization part
chapter03\dbms_hprof.sql: New file
chapter03\sql_trace_trigger.sql: New file
chapter06\display_awr.sql: Improved query that displays AWR content
chapter06\execution_plans.sql: Added example for UNION ALL (RECURSIVE WITH)
chapter07\baseline_evolution_delete.sql: New file
chapter07\baseline_upgrade_11g.sql: After import added update to set the owner of the SQL tuning set
chapter07\opt_estimate.sql: Uncommented 11g query
chapter07\outline_with_hj.sql: Script compatible with 10g/11g (set “_hash_join_enabled”)
chapter07\tune_last_statement.sql: Added SET SERVEROUTPUT OFF in the initialization part
chapter08\client-side_caching.sql: New file
chapter09\conditions.sql: Added queries containing NOT IN condition
chapter09\hash_cluster.sql: Changed comment related to IN operator because of 11.2 improvement
chapter10\hash_join.sql: Fixed typo in description
chapter10\join_elimination.sql: Fixed typo in description
chapter10\join_elimination2.sql: New file
chapter10\pwj.sql: Disabled join-filter pruning
chapter10\subquery_unnesting.sql: Cover many more cases
chapter11\ArrayInterface.java: Added check for the return value of the executeBatch method
chapter11\ArrayInterfacePerf.java: Fixed number of iterations in main method
chapter11\atomic_refresh.sql: Changed CTAS to avoid ORA-30009
chapter11\dpi_performance.sql: Changed CTAS to avoid ORA-30009
chapter11\px_auto_dop.sql: New file
chapter11\px_ddl.sql: Changed the part displaying the parallel DDL status
chapter11\px_dml.sql: Changed the part displaying the parallel DML status
chapter11\px_query.sql: Changed the part displaying the parallel query status
chapter11\result_cache_plsql.sql: Added comment about invalidation in 11.2
chapter12\data_compression.sql: Changed CTAS to avoid ORA-30009
databases\DBA10205: New directory containing the files to create the database DBA10205
databases\DBM10205: New directory containing the files to create the database DBM10205
databases\DBA11201: New directory containing the files to create the database DBA11201
databases\DBM11201: New directory containing the files to create the database DBM11201

Jun 10 2010

Related-Combine Operation „UNION ALL (RECURSIVE WITH)“

Tag: 11gR2, Query Optimizer, TOP
Christian Antognini @ 7:53 am

To make the interpretation of execution plans easier, in chapter 6 of TOP I defined three types of operations: standalone operations, unrelated-combine operations, and related-combine operations. For combine operations I also added a list of all operations of each type. Since in 11.2 a new related-combine operation is available, I decided to write this short post as an addendum to the content of the book.

The new related-combine operation, named “UNION ALL (RECURSIVE WITH)”, is available to support the new recursive subquery factoring clause. Hence, it is used for hierarchical queries. The following query and its execution plan show an example:

SQL> WITH
  2    e (xlevel, empno, ename, job, mgr, hiredate, sal, comm, deptno)
  3    AS (
  4      SELECT 1, empno, ename, job, mgr, hiredate, sal, comm, deptno
  5      FROM emp
  6      WHERE mgr IS NULL
  7      UNION ALL
  8      SELECT mgr.xlevel+1, emp.empno, emp.ename, emp.job, emp.mgr, emp.hiredate, emp.sal, emp.comm, emp.deptno
  9      FROM emp, e mgr
 10      WHERE emp.mgr = mgr.empno
 11    )
 12  SELECT *
 13  FROM e;

-------------------------------------------------------------------------------
| Id  | Operation                                 | Name    | Starts | A-Rows |
-------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                          |         |      1 |     14 |
|   1 |  VIEW                                     |         |      1 |     14 |
|   2 |   UNION ALL (RECURSIVE WITH) BREADTH FIRST|         |      1 |     14 |
|*  3 |    TABLE ACCESS FULL                      | EMP     |      1 |      1 |
|   4 |    NESTED LOOPS                           |         |      4 |     13 |
|   5 |     NESTED LOOPS                          |         |      4 |     13 |
|   6 |      RECURSIVE WITH PUMP                  |         |      4 |     14 |
|*  7 |      INDEX RANGE SCAN                     | EMP_MGR |     14 |     13 |
|   8 |     TABLE ACCESS BY INDEX ROWID           | EMP     |     13 |     13 |
-------------------------------------------------------------------------------

   3 - filter("MGR" IS NULL)
   7 - access("EMP"."MGR"="MGR"."EMPNO")
       filter("EMP"."MGR" IS NOT NULL)

Notice that there are actually two operations:

  • UNION ALL (RECURSIVE WITH) BREADTH FIRST
  • UNION ALL (RECURSIVE WITH) DEPTH FIRST

As their names suggest, the difference is due to the search clause, which you can set to either BREADTH FIRST BY or DEPTH FIRST BY.
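
As an illustration, the following sketch adds a search clause to a shortened version of the query above. The ordering column name seq and the choice of ename as the sort key are assumptions made for this example only. With it, the execution plan is expected to show the DEPTH FIRST variant of the operation.

```sql
WITH
  e (xlevel, empno, ename, mgr)
  AS (
    SELECT 1, empno, ename, mgr
    FROM emp
    WHERE mgr IS NULL
    UNION ALL
    SELECT mgr.xlevel+1, emp.empno, emp.ename, emp.mgr
    FROM emp, e mgr
    WHERE emp.mgr = mgr.empno
  )
  -- visit the children of a row before its siblings; seq stores the resulting order
  SEARCH DEPTH FIRST BY ename SET seq
SELECT xlevel, empno, ename, mgr
FROM e
ORDER BY seq;
```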

Reading an execution plan containing the “UNION ALL (RECURSIVE WITH)” operation is the same as reading one containing the “CONNECT BY WITH FILTERING” operation. As a matter of fact, the purpose of both operations is basically the same. Just notice that the “PUMP” operation used in the execution plan also differs: in the former it is called “RECURSIVE WITH PUMP”, in the latter “CONNECT BY PUMP”. For the purpose of reading the execution plan, however, the difference does not matter.

You can find a full description of how to read such an execution plan in this post.
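
For comparison, the hierarchical query shown earlier can be sketched with the traditional CONNECT BY syntax as follows. Which CONNECT BY operation appears in its execution plan depends on the decisions of the query optimizer.

```sql
-- level plays the role of the xlevel column of the recursive query
SELECT level AS xlevel, empno, ename, job, mgr, hiredate, sal, comm, deptno
FROM emp
START WITH mgr IS NULL
CONNECT BY PRIOR empno = mgr;
```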

