Jun 07 2010

Evolution of a SQL Plan Baseline Based on a DELETE Statement

Tag: 11gR1, 11gR2, Query Optimizer, SQL TraceChristian Antognini @ 3:21 pm

During an evolution the database engine compares the performance of two execution plans. The aim is to find out which one provides the better performance. For that purpose it has to run the SQL statement on which the SQL plan baseline is based and compare some execution statistics. The following output of the DBMS_SPM.EVOLVE_SQL_PLAN_BASELINE function shows which statistics are compared.

Plan was verified: Time used .06 seconds.
Plan passed performance criterion: 360.12 times better than baseline plan.
Plan was changed to an accepted plan.

                          Baseline Plan      Test Plan       Stats Ratio
                          -------------      ---------       -----------
Execution Status:              COMPLETE       COMPLETE
Rows Processed:                     100            100
Elapsed Time(ms):                 2.173           .033             65.85
CPU Time(ms):                     2.444              0
Buffer Gets:                        720              2               360
Physical Read Requests:               0              0
Physical Write Requests:              0              0
Physical Read Bytes:                  0              0
Physical Write Bytes:                 0              0
Executions:                           1              1

For queries a regular execution can be performed. But, what happens for INSERT/UPDATE/MERGE/DELETE statements? Do the SQL engine really execute them and modify data?

To answer these questions let’s have a look to an example based on a DELETE statement…

  • Setup a table used for the test:

SQL> CREATE TABLE t (id, n, pad, CONSTRAINT t_pk PRIMARY KEY (id))
  2  AS
  3  SELECT rownum, mod(rownum,100), rpad('*',500,'*')
  4  FROM dual
  5  CONNECT BY level <= 10000;

SQL> execute dbms_stats.gather_table_stats(ownname => user, tabname => 't', method_opt => 'for all columns size 254')

  • Create a SQL plan baseline:

SQL> ALTER SESSION SET optimizer_capture_sql_plan_baselines = TRUE;

SQL> DELETE t WHERE n = 42;

SQL> ROLLBACK;

SQL> DELETE t WHERE n = 42;

SQL> ROLLBACK;

SQL> ALTER SESSION SET optimizer_capture_sql_plan_baselines = FALSE;

  • Add a non-accepted execution plan to the SQL plan baseline:

SQL> CREATE INDEX i ON t (n);

SQL> DELETE t WHERE n = 42;

SQL> ROLLBACK;

SQL> DELETE t WHERE n = 42;

SQL> ROLLBACK;

  • Display the content of the SQL plan baseline (notice that two execution plans are available):

SQL> SELECT * FROM table(dbms_xplan.display_sql_plan_baseline('SYS_SQL_373d78bbba048c24', NULL, 'basic'));

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
SQL handle: SYS_SQL_373d78bbba048c24
SQL text: DELETE t WHERE n = 42
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Plan name: SQL_PLAN_3fgbsrfx093143bad20a0         Plan id: 1001201824
Enabled: YES     Fixed: NO      Accepted: YES     Origin: AUTO-CAPTURE
--------------------------------------------------------------------------------

Plan hash value: 3335594643

-----------------------------------
| Id  | Operation          | Name |
-----------------------------------
|   0 | DELETE STATEMENT   |      |
|   1 |  DELETE            | T    |
|   2 |   TABLE ACCESS FULL| T    |
-----------------------------------

--------------------------------------------------------------------------------
Plan name: SQL_PLAN_3fgbsrfx093144198692b         Plan id: 1100507435
Enabled: YES     Fixed: NO      Accepted: NO      Origin: AUTO-CAPTURE
--------------------------------------------------------------------------------

Plan hash value: 1582806765

----------------------------------
| Id  | Operation         | Name |
----------------------------------
|   0 | DELETE STATEMENT  |      |
|   1 |  DELETE           | T    |
|   2 |   INDEX RANGE SCAN| I    |
----------------------------------

  • Trace the evolution to find out what happens (notice that I deleted the output of the function because it is the one it is shown at the top of this post):

SQL> execute dbms_monitor.session_trace_enable(plan_stat=>'ALL_EXECUTIONS')

SQL> SELECT dbms_spm.evolve_sql_plan_baseline(
  2           sql_handle => 'SYS_SQL_373d78bbba048c24',
  3           plan_name  => '',
  4           time_limit => 10,
  5           verify     => 'yes',
  6           commit     => 'yes'
  7         )
  8  FROM dual;

SQL> execute dbms_monitor.session_trace_disable

SQL> SELECT value
  2  FROM v$diag_info
  3  WHERE name = 'Default Trace File';

VALUE
-------------------------------------------------------------------
/u00/app/oracle/diag/rdbms/dba112/DBA112/trace/DBA112_ora_17200.trc

Now that the trace file was generated, let’s have a look to its content. The relevant parts are two: the first one is related to the execution of the accepted execution plan, and the second one is related to the execution of the non-accepted one.

PARSING IN CURSOR #11 len=45 dep=1 uid=90 oct=7 lid=90 tim=1275524159625080 hv=4077337184 ad='325c9f10' sqlid='5fwyncmthffm0'
/* SQL Analyze(25,0) */ DELETE t WHERE n = 42
END OF STMT
PARSE #11:c=1000,e=652,p=0,cr=0,cu=0,mis=1,r=0,dep=1,og=1,plh=1001201824,tim=1275524159625078
EXEC #11:c=4999,e=5670,p=0,cr=720,cu=0,mis=0,r=0,dep=1,og=1,plh=1001201824,tim=1275524159630752
EXEC #11:c=2000,e=1718,p=0,cr=720,cu=0,mis=0,r=0,dep=1,og=1,plh=1001201824,tim=1275524159632613
EXEC #11:c=2000,e=1511,p=0,cr=720,cu=0,mis=0,r=0,dep=1,og=1,plh=1001201824,tim=1275524159634156
EXEC #11:c=2000,e=1542,p=0,cr=720,cu=0,mis=0,r=0,dep=1,og=1,plh=1001201824,tim=1275524159636144
EXEC #11:c=2000,e=1552,p=0,cr=720,cu=0,mis=0,r=0,dep=1,og=1,plh=1001201824,tim=1275524159638151
EXEC #11:c=3998,e=4015,p=0,cr=720,cu=0,mis=0,r=0,dep=1,og=1,plh=1001201824,tim=1275524159642613
EXEC #11:c=3000,e=2905,p=0,cr=720,cu=0,mis=0,r=0,dep=1,og=1,plh=1001201824,tim=1275524159645549
EXEC #11:c=2000,e=1506,p=0,cr=720,cu=0,mis=0,r=0,dep=1,og=1,plh=1001201824,tim=1275524159647151
EXEC #11:c=2000,e=1562,p=0,cr=720,cu=0,mis=0,r=0,dep=1,og=1,plh=1001201824,tim=1275524159649160
EXEC #11:c=2999,e=2440,p=0,cr=720,cu=0,mis=0,r=0,dep=1,og=1,plh=1001201824,tim=1275524159652037
CLOSE #11:c=0,e=3,dep=1,type=0,tim=1275524159652065

PARSING IN CURSOR #5 len=45 dep=1 uid=90 oct=7 lid=90 tim=1275524159657503 hv=4077337184 ad='325c9f10' sqlid='5fwyncmthffm0'
/* SQL Analyze(25,0) */ DELETE t WHERE n = 42
END OF STMT
PARSE #5:c=1000,e=859,p=0,cr=0,cu=0,mis=1,r=0,dep=1,og=1,plh=1100507435,tim=1275524159657499
EXEC #5:c=0,e=52,p=0,cr=2,cu=0,mis=0,r=0,dep=1,og=1,plh=1100507435,tim=1275524159657625
EXEC #5:c=0,e=0,p=0,cr=2,cu=0,mis=0,r=0,dep=1,og=1,plh=1100507435,tim=1275524159657647
EXEC #5:c=0,e=31,p=0,cr=2,cu=0,mis=0,r=0,dep=1,og=1,plh=1100507435,tim=1275524159657972
EXEC #5:c=0,e=5,p=0,cr=2,cu=0,mis=0,r=0,dep=1,og=1,plh=1100507435,tim=1275524159658071
EXEC #5:c=0,e=0,p=0,cr=2,cu=0,mis=0,r=0,dep=1,og=1,plh=1100507435,tim=1275524159658071
EXEC #5:c=0,e=0,p=0,cr=2,cu=0,mis=0,r=0,dep=1,og=1,plh=1100507435,tim=1275524159658071
EXEC #5:c=0,e=0,p=0,cr=2,cu=0,mis=0,r=0,dep=1,og=1,plh=1100507435,tim=1275524159658071
EXEC #5:c=0,e=0,p=0,cr=2,cu=0,mis=0,r=0,dep=1,og=1,plh=1100507435,tim=1275524159658071
EXEC #5:c=0,e=0,p=0,cr=2,cu=0,mis=0,r=0,dep=1,og=1,plh=1100507435,tim=1275524159658071
EXEC #5:c=0,e=0,p=0,cr=2,cu=0,mis=0,r=0,dep=1,og=1,plh=1100507435,tim=1275524159658071
CLOSE #5:c=0,e=0,dep=1,type=0,tim=1275524159658071

In the previous output notice that:

  • The PLH attribute of the EXEC lines shows that two execution plans are used.
  • Each execution plan was executed 10 times (in practice the number varies according to the elapsed time; i.e. for longer executions a single run might be enough to determine whether an execution plan has to be accepted).
  • Even though I set the PLAN_STAT parameter to ALL_EXECUTIONS (if you don’t know what the PLAN_STAT parameter is for, have a look to this post) the STAT lines (the execution plan) are not available in the trace file.

According to this information the SQL statement is executed. But, if you check the table after the evolution, the data is still there. And that, honestly, is not an option! In addition, no ROLLBACK is executed (no XCTEND lines are present in the trace file). So, it seems that the SQL statement is not executed.

What I really miss in the trace file are the execution plans associated to the executions to check what the different operations of the execution plan did. The only way I found to have them, it is to add the GATHER_PLAN_STATISTICS hint into the SQL statement itself (also setting the STATISTICS_LEVEL parameter and checking a view like V$SQL_PLAN_STATISTICS_ALL did not help). The content of the trace file, formatted by TVD$XTAT, is the following:

Optimizer Mode       ALL_ROWS
Hash Value           1001201824
Number of Executions 10

        Rows Operation
------------ ---------------------------------------------------------------------------------------
           0 DELETE  T (cr=720 pr=25 pw=0 time=0 us)
           0   TABLE ACCESS FULL T (cr=720 pr=25 pw=0 time=0 us cost=84 size=700 card=100)

Optimizer Mode       ALL_ROWS
Hash Value           1100507435
Number of Executions 10

        Rows Operation
------------ ---------------------------------------------------------------------------------------
           0 DELETE  T (cr=2 pr=0 pw=0 time=0 us)
           0   INDEX RANGE SCAN I (cr=2 pr=0 pw=0 time=0 us cost=1 size=700 card=100) (object id 93840)

Notice that while the number of logical reads (CR attribute) matches the report generated by the evolution, the number of rows returned by both steps of the execution plans is 0. And that, even though the index range scan should return 100 rows.

In summary, during an evolution the SQL engine processes the SQL statements in a special way. The data is accessed, but not modified. Hence, SQL statements are only partially executed. I do not regard this fact as a problem, though. In fact, the operations that modify data should always perform the same work independently on how the data to be modified is located (in the example given here, either with a full table scan or an index range scan).


Jun 03 2010

Optimizer Mode Mismatch Does Not Prevent Sharing of Child Cursor!?!?

Tag: 10gR1, 10gR2, 11gR1, 11gR2, 9iR2, Bug, Query Optimizer, SQL TraceChristian Antognini @ 6:40 pm

The aim of this post is to describe a strange (buggy) situation that I observed recently. But before doing that, I shortly summarize what a parent cursor and a child cursor are as well as when they can be shared. By the way, I borrowed this description from the pages 20/21 of my book. Hence, if you are interested in more information about this topic refer to it…

The result of a parse operation is a parent cursor and a child cursor stored in the library cache.

The key information related to a parent cursor is the text of the SQL statement. Therefore, several SQL statements share the same parent cursor if their text is exactly the same (note that there is at least an exception to this, specifically when cursor sharing is used). In the following example, four SQL statements are executed. Two have the same text. Two others differ only because of lowercase and uppercase letters or blanks. Through the V$SQLAREA view, it is possible to confirm that three distinct parent cursors were created.

SQL> ALTER SYSTEM FLUSH SHARED_POOL;

SQL> SELECT * FROM t WHERE n = 1234;

SQL> select * from t where n = 1234;

SQL> SELECT * FROM t WHERE n=1234;

SQL> SELECT * FROM t WHERE n = 1234;

SQL> SELECT sql_id, sql_text, executions
  2  FROM v$sqlarea
  3  WHERE sql_text LIKE '%1234';

SQL_ID        SQL_TEXT                          EXECUTIONS
------------- --------------------------------- ----------
2254m1487jg50 select * from t where n = 1234             1
g9y3jtp6ru4cb SELECT * FROM t WHERE n = 1234             2
7n8p5s2udfdsn SELECT * FROM t WHERE n=1234               1

The key information related to a child cursor is the execution plan and the execution environment related to it. The execution environment is important because if it changes, the execution plan might change as well. As a result, several SQL statements are able to share the same child cursor only if they share the same parent cursor and their execution environments are compatible. To illustrate, the same SQL statement is executed with two different values of the initialization OPTIMIZER_MODE parameter. The result is that a single parent cursor and two child cursors are created.

SQL> ALTER SESSION SET optimizer_mode = all_rows;

SQL> SELECT count(*) FROM t;

COUNT(*)
----------
      1000

SQL> ALTER SESSION SET optimizer_mode = first_rows_10;

SQL> SELECT count(*) FROM t;

COUNT(*)
----------
      1000

SQL> SELECT sql_id, child_number, sql_text, optimizer_mode, plan_hash_value
  2  FROM v$sql
  3  WHERE sql_id = (SELECT prev_sql_id
  4  FROM v$session
  5  WHERE sid = sys_context('userenv','sid'));

SQL_ID        CHILD_NUMBER SQL_TEXT               OPTIMIZER_MODE PLAN_HASH_VALUE
------------- ------------ ---------------------- -------------- ---------------
5tjqf7sx5dzmj            0 SELECT count(*) FROM t ALL_ROWS            2966233522
5tjqf7sx5dzmj            1 SELECT count(*) FROM t FIRST_ROWS          2966233522

To know which mismatch led to several child cursors, you can query the V$SQL_SHARED_CURSOR view.

SQL> SELECT child_number, optimizer_mode_mismatch
  2  FROM v$sql_shared_cursor
  3  WHERE sql_id = '5tjqf7sx5dzmj';

CHILD_NUMBER OPTIMIZER_MODE_MISMATCH
------------ -----------------------
           0 N
           1 Y

So far, so good… Now, let’s see what’s strange…

The interesting thing to point out about the previous example is that while I set FIRST_ROWS_10 as optimizer mode, the V$SQL view displayed the value FIRST_ROWS. Mhmm… That’s strange… They are two different optimizer modes. They cannot be considered equivalent. What are the implications? It is just the view that provides the wrong information or the database engine is able to share the same child cursor even with two different values of the OPTIMIZER_MODE parameter? Let’s try it with FIRST_ROWS (i.e. without “_10”)…

 SQL> ALTER SESSION SET optimizer_mode = first_rows;

SQL> SELECT sql_id, child_number, sql_text, optimizer_mode, executions
  2  FROM v$sql
  3  WHERE sql_id = (SELECT prev_sql_id
  4                  FROM v$session
  5                  WHERE sid = sys_context('userenv','sid'));

SQL_ID        CHILD_NUMBER SQL_TEXT                          OPTIMIZER_MODE EXECUTIONS
------------- ------------ --------------------------------- -------------- ----------
5tjqf7sx5dzmj            0 SELECT count(*) FROM t            ALL_ROWS                1
5tjqf7sx5dzmj            1 SELECT count(*) FROM t            FIRST_ROWS              2

Oh, damn! Even though the OPTIMIZER MODE is set to a different value the same child cursor is used. Since in this particular situation the execution plans associated to both child cursors are the same (their hash value are equal), it’s not a real problem. But, in practice, it might be possible that two different optimizer modes lead to different execution plans. The following example illustrates this.

  • Build a table for the test:

SQL> CREATE TABLE t AS
  2  SELECT rownum AS id, rpad('*',500,'*') AS pad
  3  FROM dual
  4  CONNECT BY level <= 1000;

SQL> CREATE UNIQUE INDEX i ON t (id);

SQL> execute dbms_stats.gather_table_stats(user, 'T')

  • Show that different values of the OPTIMIZER_MODE parameter lead to different execution plans:

SQL> ALTER SESSION SET optimizer_mode = FIRST_ROWS_1;

SQL> EXPLAIN PLAN FOR SELECT * FROM t WHERE id <= 500;

SQL> SELECT * FROM table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------
Plan hash value: 242607798

------------------------------------------------------------------------------------
| Id  | Operation                   | Name | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |      |     3 |  1515 |     3   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T    |     3 |  1515 |     3   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | I    |       |       |     2   (0)| 00:00:01 |
------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("ID"<=500)

SQL> ALTER SESSION SET optimizer_mode = FIRST_ROWS_1000;

SQL> EXPLAIN PLAN FOR SELECT * FROM t WHERE id <= 500;

SQL> SELECT * FROM table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------
Plan hash value: 1601196873

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |   500 |   246K|    10   (0)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| T    |   500 |   246K|    10   (0)| 00:00:01 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("ID"<=500)

  • Execute the test query with both values of the OPTIMIZER_MODE parameter:

SQL> ALTER SYSTEM FLUSH SHARED_POOL;

SQL> ALTER SESSION SET optimizer_mode = FIRST_ROWS_1;

SQL> SELECT * FROM t WHERE id <= 500;

        ID PAD
---------- ----------
         1 **********
         2 **********
…
       499 **********
       500 **********

SQL> ALTER SESSION SET optimizer_mode = FIRST_ROWS_1000;

SQL> SELECT * FROM t WHERE id <= 500;

        ID PAD
---------- ----------
         1 **********
         2 **********
…
       499 **********
       500 **********

  • Show that a single execution plan was used for both executions:

SQL> SELECT * FROM table(dbms_xplan.display_cursor(NULL,NULL));

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------
SQL_ID  2vw03p929jzgz, child number 0
-------------------------------------
SELECT * FROM t WHERE id <= 500

Plan hash value: 242607798

------------------------------------------------------------------------------------
| Id  | Operation                   | Name | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |      |       |       |     3 (100)|          |
|   1 |  TABLE ACCESS BY INDEX ROWID| T    |     3 |  1515 |     3   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | I    |       |       |     2   (0)| 00:00:01 |
------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("ID"<=500)

SQL> SELECT sql_id, child_number, executions, optimizer_mode
  2  FROM v$sql
  3  WHERE sql_id = '2vw03p929jzgz';

SQL_ID        CHILD_NUMBER EXECUTIONS OPTIMIZER_MODE
------------- ------------ ---------- --------------
2vw03p929jzgz            0          2 FIRST_ROWS

Even though it is not very likely that this bug (yes, in my opinion something like this cannot be considered a restriction of the implementation…) has an impact on a production system, I really don’t understand why the developers didn’t implement it correctly. It should not be that difficult to manage a byte containing the information about the used optimizer mode! Note that this is not the only case where something like that happens with the first rows optimizer mode. For example, also in a trace file generated through SQL trace no difference is made between the old and the new first row optimizer. So, it seams that they really got it wrong.


May 20 2010

How Good Are the Values Returned by DBMS_COMPRESSION.GET_COMPRESSION_RATIO?

Tag: 11gR2, Bug, ExadataChristian Antognini @ 1:00 am

According to the documentation the GET_COMPRESSION_RATIO procedure of the DBMS_COMPRESSION package can be used to assess the impact of different compression options for a given table. In other words, it allows us to find out the expected compression ratio for a given set of data without having to really create a compressed table. The question is: how good are the values it returns?

Before answering this question it is essential to point out two things. First, this package is available as of 11.2 only. Second, it can be used to find out (and compare) the expected compression ratio of OLTP compression as well as Exadata Hybrid Columnar Compression (EHCC). Note that for getting values about EHCC it is not required to actually have access to an Exadata Storage Server. How is that possible?

To answer this second question, let’s describe how such an analysis is performed…

The following PL/SQL block shows how to start an analysis for all uncompressed tables of the current user.

DECLARE
  l_blkcnt_cmp       BINARY_INTEGER;
  l_blkcnt_uncmp     BINARY_INTEGER;
  l_row_cmp          BINARY_INTEGER;
  l_row_uncmp        BINARY_INTEGER;
  l_cmp_ratio        NUMBER;
  l_comptype_str     VARCHAR2(100);
BEGIN
  FOR i IN (SELECT table_name
            FROM user_tables
            WHERE compression = 'DISABLED'
            ORDER BY table_name)
  LOOP
    FOR j IN 1..5
    LOOP
      dbms_compression.get_compression_ratio(
        -- input parameters
        scratchtbsname   => 'SCRATCH',       -- scratch tablespace
        ownname          => user,            -- owner of the table
        tabname          => i.table_name,    -- table name
        partname         => NULL,            -- partition name
        comptype         => power(2,j),      -- compression algorithm
        -- output parameters
        blkcnt_cmp       => l_blkcnt_cmp,    -- number of compressed blocks
        blkcnt_uncmp     => l_blkcnt_uncmp,  -- number of uncompressed blocks
        row_cmp          => l_row_cmp,       -- number of rows in a compressed block
        row_uncmp        => l_row_uncmp,     -- number of rows in an uncompressed block
        cmp_ratio        => l_cmp_ratio,     -- compression ratio
        comptype_str     => l_comptype_str   -- compression type
      );
      dbms_output.put_line(i.table_name||' - '||'type: '||l_comptype_str||' ratio: '||to_char(l_cmp_ratio,'99.999'));
    END LOOP;
  END LOOP;
END;

As you can see the idea is that we provide to the package a table and, based on its data, the package estimates the compression ratio that can be achieved with the specified algorithm. Notice that as input parameter a scratch tablespace is also specified. This is necessary because the package, to estimate the output parameters, actually creates a compressed table. If you trace an execution you would see two CREATE TABLE statements.

  • The first one creates a scratch table based on a sample of the content of the input table (the sampling percentage is chosen based on the size of the table; simply put the sampling percentage is inversely proportional to the table size):

CREATE TABLE dbms_tabcomp_temp_uncmp
TABLESPACE <tablespace name>
AS
SELECT * FROM <table name> SAMPLE BLOCK( <sampling percentage> )

  • The second one creates another scratch table where the data is compressed according to the specified compression algorithm (in this case COMPRESS FOR QUERY HIGH):

CREATE TABLE dbms_tabcomp_temp_uncmp
ORGANIZATION HEAP
TABLESPACE <tablespace name>
COMPRESS FOR QUERY HIGH
AS
SELECT * FROM dbms_tabcomp_temp_uncmp

In other words the estimations are based on data that is actually compressed. Not on some heuristics…

The essential thing to note is that the second CREATE TABLE statement cannot be directly executed without an Exadata Storage Server. In fact, if you try to run it, the database engine raises an ORA-64307 (hybrid columnar compression is only supported in tablespaces residing on Exadata storage). Based on that observation it is sensible to say that in the current implementation there is a software lock that prevents us from using EHCC without an Exadata Storage Server. But, as just described, that lock can be overridden by the DBMS_COMPRESSION package.

Now, back to the initial question: how good are the values it returns?

To answer this question I installed a TPC-H schema (scale factor 10) and compared the estimations of the package with the actual values. For that purpose I created one table with each type of compression.

  • The compression ratios based on the BLOCKS column of the USER_TABLES view are the following:

SQL> SELECT comp.table_name, round(uncomp.blocks/comp.blocks,3) AS ratio
  2  FROM user_tables comp, user_tables uncomp
  3  WHERE comp.compression = 'ENABLED'
  4  AND uncomp.compression = 'DISABLED'
  5  AND comp.table_name LIKE uncomp.table_name||'/_%' ESCAPE '/'
  6  ORDER BY uncomp.table_name, nullif(comp.compress_for,'OLTP') DESC;

TABLE_NAME                          RATIO
------------------------------ ----------
CUSTOMER_OLTP                       1.041
CUSTOMER_QUERY_LOW                  2.220
CUSTOMER_QUERY_HIGH                 3.708
CUSTOMER_ARCHIVE_LOW                3.837
CUSTOMER_ARCHIVE_HIGH               4.432
LINEITEM_OLTP                       1.487
LINEITEM_QUERY_LOW                  3.034
LINEITEM_QUERY_HIGH                 4.912
LINEITEM_ARCHIVE_LOW                5.144
LINEITEM_ARCHIVE_HIGH               6.704
ORDERS_OLTP                         1.149
ORDERS_QUERY_LOW                    2.887
ORDERS_QUERY_HIGH                   5.038
ORDERS_ARCHIVE_LOW                  5.305
ORDERS_ARCHIVE_HIGH                 6.704
PART_OLTP                           1.328
PART_QUERY_LOW                      3.307
PART_QUERY_HIGH                     6.718
PART_ARCHIVE_LOW                    7.296
PART_ARCHIVE_HIGH                  11.049
PARTSUPP_OLTP                       0.991
PARTSUPP_QUERY_LOW                  2.892
PARTSUPP_QUERY_HIGH                 5.428
PARTSUPP_ARCHIVE_LOW                5.659
PARTSUPP_ARCHIVE_HIGH               8.071

  • The output of the PL/SQL block shown above is the following (be careful that the order is slightly different):

CUSTOMER - type: "Compress For OLTP" ratio:   1.041
CUSTOMER - type: "Compress For Query High" ratio:   3.742
CUSTOMER - type: "Compress For Query Low" ratio:   2.228
CUSTOMER - type: "Compress For Archive High" ratio:   4.460
CUSTOMER - type: "Compress For Archive Low" ratio:   3.894
ORDERS - type: "Compress For OLTP" ratio:   1.149
ORDERS - type: "Compress For Query High" ratio:   5.048
ORDERS - type: "Compress For Query Low" ratio:   2.886
ORDERS - type: "Compress For Archive High" ratio:   6.651
ORDERS - type: "Compress For Archive Low" ratio:   5.325
PART - type: "Compress For OLTP" ratio:   1.328
PART - type: "Compress For Query High" ratio:   6.778
PART - type: "Compress For Query Low" ratio:   3.338
PART - type: "Compress For Archive High" ratio:  11.126
PART - type: "Compress For Archive Low" ratio:   7.343
PARTSUPP - type: "Compress For OLTP" ratio:    .990
PARTSUPP - type: "Compress For Query High" ratio:   5.451
PARTSUPP - type: "Compress For Query Low" ratio:   2.893
PARTSUPP - type: "Compress For Archive High" ratio:   8.051
PARTSUPP - type: "Compress For Archive Low" ratio:   5.657

By comparing the estimations with the actual values it is possible to say that the value returned by the DBMS_COMPRESSION package are very accurate. Having so accurate estimations is a good thing because you probably want to know how much space you can save before doing a major reorganization of large tables as well as before buying an Exadata Storage Server just to test how effective the compression of EHCC is.

You might ask why the second output does not provide information about the LINEITEM table. The problem is that the package was not able to process it. In fact, the following error was raised:

ORA-00942: table or view does not exist
ORA-06512: at "SYS.PRVT_COMPRESSION", line 459
ORA-30562: SAMPLE percentage must be in the range [0.000001,100)
ORA-06512: at "SYS.DBMS_COMPRESSION", line 214
ORA-06512: at line 16

The problem is (probably) due to a rounding performed during the selection of the sampling percentage. In fact, for the LINEITEM table (which is the bigger one), the following CREATE TABLE statement was generated (notice that the sampling percentage is set to 0):

CREATE TABLE dbms_tabcomp_temp_uncmp
TABLESPACE scratch
AS
SELECT *
FROM lineitem SAMPLE BLOCK( 0)

I was not able to find a bug in MOS, but it is definitely one.


May 05 2010

Exadata Storage Server and the Query Optimizer – Part 3

Tag: 11gR1, 11gR2, Exadata, Parallel Processing, Query OptimizerChristian Antognini @ 3:55 pm

In the first and second post of this series I shared with you some basics about smart scan and gave some details about projection and restriction. The aim of this post is to cover the third basic technique: join filtering.

Join filtering is not something specific to the Exadata Storage Server. In fact, it is an Enterprise Edition feature available since Oracle Database 10g Release 2. Simply put, it is used to reduce data communication between slave processes in parallel joins. For more information about it I suggest you to read a paper I published in June 2008 entitled Bloom Filters. In it I describe not only what bloom filters are, but also how Oracle Database uses them. And, one of the use cases is join filtering.

What I want to show here is how Exadata Storage Server is able to take advantage of join filtering. For that purpose let’s have a look to the following execution plan:

-----------------------------------------------------
| Id  | Operation                        | Name     |
-----------------------------------------------------
|   0 | SELECT STATEMENT                 |          |
|   1 |  PX COORDINATOR                  |          |
|   2 |   PX SEND QC (RANDOM)            | :TQ10002 |
|*  3 |    HASH JOIN BUFFERED            |          |
|   4 |     JOIN FILTER CREATE           | :BF0000  |
|   5 |      PX RECEIVE                  |          |
|   6 |       PX SEND HASH               | :TQ10000 |
|   7 |        PX BLOCK ITERATOR         |          |
|*  8 |         TABLE ACCESS STORAGE FULL| T1       |
|   9 |     PX RECEIVE                   |          |
|  10 |      PX SEND HASH                | :TQ10001 |
|  11 |       JOIN FILTER USE            | :BF0000  |
|  12 |        PX BLOCK ITERATOR         |          |
|* 13 |         TABLE ACCESS STORAGE FULL| T2       |
-----------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   3 - access("T1"."ID"="T2"."ID")
   8 - storage("T1"."MOD"=42)
       filter("T1"."MOD"=42)
  13 - storage(SYS_OP_BLOOM_FILTER(:BF0000,"T2"."ID"))
       filter(SYS_OP_BLOOM_FILTER(:BF0000,"T2"."ID"))

As you can see, join filtering is used. In fact, the operation 4 (JOIN FILTER CREATE) builds a bloom filter that, later on, is used by operation 11 (JOIN FILTER USE) to filter out the data that does not fulfill the join condition. However, the most important thing to notice in this execution plan is the STORAGE predicate applied by the operation 13. According to it the bloom filter is applied not only by the operation 11, but also by the operation 13. And, since the operation 13 is a smart scan operation, the STORAGE predicate is evaluated by the cells. This means that the reduction of data communication does not only take place between slave processes, but also between the cells and the database instances. Remarkable!


May 04 2010

Native Full Outer Join Officially Available in 10.2.0.5

Tag: 10gR2, Query OptimizerChristian Antognini @ 12:25 pm

Today I installed for the first time the patchset 10.2.0.5. While reading the README file, I noticed the following piece of information.

To enable a new native full outer join implementation in the database, a user has to set the following underscore parameter:

_optimizer_native_full_outer_join =force

You can set this parameter for the system or for a specific session.

Besides dramatically improving the performance of a full outer join, the new implementation fixes a variety of issues, for examples a variety of ORA-942 (table or view doesn’t exists) and ORA-4331 (unable to allocate string bytes of shared memory) errors.

This issue is tracked with Oracle bug 6322672.

Great! At last we can officially take advantage of native full outer join also in 10.2 (the feature was officially introduced in 11.1, but was already “available” in 10.2.0.3).

Here is an example:

  • By default native full outer joins are disabled (notice the implementation with the UNION ALL operation):

SQL> SELECT * FROM emp FULL OUTER JOIN dept USING (deptno);

Execution Plan
----------------------------------------------------------
Plan hash value: 2291915024

---------------------------------------------
| Id  | Operation            | Name         |
---------------------------------------------
|   0 | SELECT STATEMENT     |              |
|   1 |  VIEW                |              |
|   2 |   UNION-ALL          |              |
|*  3 |    HASH JOIN OUTER   |              |
|   4 |     TABLE ACCESS FULL| EMP          |
|   5 |     TABLE ACCESS FULL| DEPT         |
|   6 |    NESTED LOOPS ANTI |              |
|   7 |     TABLE ACCESS FULL| DEPT         |
|*  8 |     INDEX RANGE SCAN | EMP_DEPTNO_I |
---------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - access("EMP"."DEPTNO"="DEPT"."DEPTNO"(+))
   8 - access("EMP"."DEPTNO"="DEPT"."DEPTNO")

  • As suggested by the README file, the feature can be enabled at the session level:

SQL> ALTER SESSION SET "_optimizer_native_full_outer_join" = force;

SQL> SELECT * FROM emp FULL OUTER JOIN dept USING (deptno);

Execution Plan
----------------------------------------------------------
Plan hash value: 51889263

------------------------------------------
| Id  | Operation             | Name     |
------------------------------------------
|   0 | SELECT STATEMENT      |          |
|   1 |  VIEW                 | VW_FOJ_0 |
|*  2 |   HASH JOIN FULL OUTER|          |
|   3 |    TABLE ACCESS FULL  | DEPT     |
|   4 |    TABLE ACCESS FULL  | EMP      |
------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("EMP"."DEPTNO"="DEPT"."DEPTNO")


« Previous PageNext Page »