According to the documentation the GET_COMPRESSION_RATIO procedure of the DBMS_COMPRESSION package can be used to assess the impact of different compression options for a given table. In other words, it allows us to find out the expected compression ratio for a given set of data without having to really create a compressed table. The question is: how good are the values it returns?
Before answering this question it is essential to point out two things. First, this package is available as of 11.2 only. Second, it can be used to find out (and compare) the expected compression ratio of OLTP compression as well as Exadata Hybrid Columnar Compression (EHCC). Note that for getting values about EHCC it is not required to actually have access to an Exadata Storage Server. How is that possible?
To answer this second question, let’s describe how such an analysis is performed…
The following PL/SQL block shows how to start an analysis for all uncompressed tables of the current user.
DECLARE l_blkcnt_cmp BINARY_INTEGER; l_blkcnt_uncmp BINARY_INTEGER; l_row_cmp BINARY_INTEGER; l_row_uncmp BINARY_INTEGER; l_cmp_ratio NUMBER; l_comptype_str VARCHAR2(100); BEGIN FOR i IN (SELECT table_name FROM user_tables WHERE compression = 'DISABLED' ORDER BY table_name) LOOP FOR j IN 1..5 LOOP dbms_compression.get_compression_ratio( -- input parameters scratchtbsname => 'SCRATCH', -- scratch tablespace ownname => user, -- owner of the table tabname => i.table_name, -- table name partname => NULL, -- partition name comptype => power(2,j), -- compression algorithm -- output parameters blkcnt_cmp => l_blkcnt_cmp, -- number of compressed blocks blkcnt_uncmp => l_blkcnt_uncmp, -- number of uncompressed blocks row_cmp => l_row_cmp, -- number of rows in a compressed block row_uncmp => l_row_uncmp, -- number of rows in an uncompressed block cmp_ratio => l_cmp_ratio, -- compression ratio comptype_str => l_comptype_str -- compression type ); dbms_output.put_line(i.table_name||' - '||'type: '||l_comptype_str||' ratio: '||to_char(l_cmp_ratio,'99.999')); END LOOP; END LOOP; END;
As you can see the idea is that we provide to the package a table and, based on its data, the package estimates the compression ratio that can be achieved with the specified algorithm. Notice that as input parameter a scratch tablespace is also specified. This is necessary because the package, to estimate the output parameters, actually creates a compressed table. If you trace an execution you would see two CREATE TABLE statements.
- The first one creates a scratch table based on a sample of the content of the input table (the sampling percentage is chosen based on the size of the table; simply put the sampling percentage is inversely proportional to the table size):
CREATE TABLE dbms_tabcomp_temp_uncmp TABLESPACEAS SELECT * FROM {table name} SAMPLE BLOCK( {sampling percentage} )
- The second one creates another scratch table where the data is compressed according to the specified compression algorithm (in this case COMPRESS FOR QUERY HIGH):
CREATE TABLE dbms_tabcomp_temp_uncmp ORGANIZATION HEAP TABLESPACECOMPRESS FOR QUERY HIGH AS SELECT * FROM dbms_tabcomp_temp_uncmp
In other words the estimations are based on data that is actually compressed. Not on some heuristics…
The essential thing to note is that the second CREATE TABLE statement cannot be directly executed without an Exadata Storage Server. In fact, if you try to run it, the database engine raises an ORA-64307 (hybrid columnar compression is only supported in tablespaces residing on Exadata storage). Based on that observation it is sensible to say that in the current implementation there is a software lock that prevents us from using EHCC without an Exadata Storage Server. But, as just described, that lock can be overridden by the DBMS_COMPRESSION package.
Now, back to the initial question: how good are the values it returns?
To answer this question I installed a TPC-H schema (scale factor 10) and compared the estimations of the package with the actual values. For that purpose I created one table with each type of compression.
- The compression ratios based on the BLOCKS column of the USER_TABLES view are the following:
SQL> SELECT comp.table_name, round(uncomp.blocks/comp.blocks,3) AS ratio 2 FROM user_tables comp, user_tables uncomp 3 WHERE comp.compression = 'ENABLED' 4 AND uncomp.compression = 'DISABLED' 5 AND comp.table_name LIKE uncomp.table_name||'/_%' ESCAPE '/' 6 ORDER BY uncomp.table_name, nullif(comp.compress_for,'OLTP') DESC; TABLE_NAME RATIO ------------------------------ ---------- CUSTOMER_OLTP 1.041 CUSTOMER_QUERY_LOW 2.220 CUSTOMER_QUERY_HIGH 3.708 CUSTOMER_ARCHIVE_LOW 3.837 CUSTOMER_ARCHIVE_HIGH 4.432 LINEITEM_OLTP 1.487 LINEITEM_QUERY_LOW 3.034 LINEITEM_QUERY_HIGH 4.912 LINEITEM_ARCHIVE_LOW 5.144 LINEITEM_ARCHIVE_HIGH 6.704 ORDERS_OLTP 1.149 ORDERS_QUERY_LOW 2.887 ORDERS_QUERY_HIGH 5.038 ORDERS_ARCHIVE_LOW 5.305 ORDERS_ARCHIVE_HIGH 6.704 PART_OLTP 1.328 PART_QUERY_LOW 3.307 PART_QUERY_HIGH 6.718 PART_ARCHIVE_LOW 7.296 PART_ARCHIVE_HIGH 11.049 PARTSUPP_OLTP 0.991 PARTSUPP_QUERY_LOW 2.892 PARTSUPP_QUERY_HIGH 5.428 PARTSUPP_ARCHIVE_LOW 5.659 PARTSUPP_ARCHIVE_HIGH 8.071
- The output of the PL/SQL block shown above is the following (be careful that the order is slightly different):
CUSTOMER - type: "Compress For OLTP" ratio: 1.041 CUSTOMER - type: "Compress For Query High" ratio: 3.742 CUSTOMER - type: "Compress For Query Low" ratio: 2.228 CUSTOMER - type: "Compress For Archive High" ratio: 4.460 CUSTOMER - type: "Compress For Archive Low" ratio: 3.894 ORDERS - type: "Compress For OLTP" ratio: 1.149 ORDERS - type: "Compress For Query High" ratio: 5.048 ORDERS - type: "Compress For Query Low" ratio: 2.886 ORDERS - type: "Compress For Archive High" ratio: 6.651 ORDERS - type: "Compress For Archive Low" ratio: 5.325 PART - type: "Compress For OLTP" ratio: 1.328 PART - type: "Compress For Query High" ratio: 6.778 PART - type: "Compress For Query Low" ratio: 3.338 PART - type: "Compress For Archive High" ratio: 11.126 PART - type: "Compress For Archive Low" ratio: 7.343 PARTSUPP - type: "Compress For OLTP" ratio: .990 PARTSUPP - type: "Compress For Query High" ratio: 5.451 PARTSUPP - type: "Compress For Query Low" ratio: 2.893 PARTSUPP - type: "Compress For Archive High" ratio: 8.051 PARTSUPP - type: "Compress For Archive Low" ratio: 5.657
By comparing the estimations with the actual values it is possible to say that the value returned by the DBMS_COMPRESSION package are very accurate. Having so accurate estimations is a good thing because you probably want to know how much space you can save before doing a major reorganization of large tables as well as before buying an Exadata Storage Server just to test how effective the compression of EHCC is.
You might ask why the second output does not provide information about the LINEITEM table. The problem is that the package was not able to process it. In fact, the following error was raised:
ORA-00942: table or view does not exist ORA-06512: at "SYS.PRVT_COMPRESSION", line 459 ORA-30562: SAMPLE percentage must be in the range [0.000001,100) ORA-06512: at "SYS.DBMS_COMPRESSION", line 214 ORA-06512: at line 16
The problem is (probably) due to a rounding performed during the selection of the sampling percentage. In fact, for the LINEITEM table (which is the bigger one), the following CREATE TABLE statement was generated (notice that the sampling percentage is set to 0):
CREATE TABLE dbms_tabcomp_temp_uncmp TABLESPACE scratch AS SELECT * FROM lineitem SAMPLE BLOCK( 0)
I was not able to find a bug in MOS, but it is definitely one.
This is bug 9086904.
Hi Greg
Thank you for poiting out the bug number. I checked again and, as it happens, it is not public yet (at least, I cannot see it). Anyway, it’s good to know that it’s a known bug.
Cheers,
Chris
Haven’t seen anyone in Oracle world toss the word “database engine”, well maybe you work with other rdbms as well :-)
Hi Walter
Unfortunately, IMO, some words are frequently used in the wrong way. How often I read “… the database does something…” when, in fact, the database is just a buch of files and, therefore, cannot process any kind of activity. So, I use “database engine”. Another word that could be used with Oracle is “instance”. But, I think that a general term that should be self explaining for everyone is better.
Cheers,
Chris
PS: in my book I use the term “database engine” 190 times.
Thanks, Christian, for the very clear demonstration!
I found this package to work quite nicely in my “sandbox” environment, but it worked decidedly more poorly with real production data. I encountered ORA-30562 errors (just as you describe) on any table/partition larger than about 50,000 blocks.
I have had a MOS ticket open on the issue for over a week. Despite having requested escalation several times, the engineer assigned was still insisting that it is a “privilege” problem, even though I have pointed out repeatedly: “I am running it as SYS. How much more privilege do I need?”. *sigh*
My thanks to Greg for mentioning a bug number. I will see if perhaps that information moves my SR along. :-)
Thanks for sharing, just went through a round of doing this by hand on 11.1 to determine if a particular database could reduce physical I/O by using compressed tables or indexes. Would be nice to use this feature once we are up to 11.2 (and get a patch to fix that bug :) ) instead of my homegrown scripts.
i’d like to use this function , but need apply patch 8896202 first and rerun some scripts.
It’s so regretful that we can not test really hybrid columnar compression without Exdata V2.
Hi,
When i am trying to execute the dbms_compression for ratio i am getting an error like
ora-01403 no data found..can u help me with this?
I opened a tar with oracle for the below error.
ERROR at line 1:
ORA-00942: table or view does not exist
ORA-06512: at “SYS.PRVT_COMPRESSION”, line 459
ORA-30562: SAMPLE percentage must be in the range [0.000001,100)
ORA-06512: at “SYS.DBMS_COMPRESSION”, line 214
ORA-06512: at line 9
Well it took a long time to come to a table as some of the test cases ran by oracle returned without any errors due to platform differences. The same procedure works wells on Windows atleast that is what I was told by Oracle. We did a OWC and the analyst finally understood the issue. The patch recommeneded to me was 9381665 that had patches for 2 bugs 9086904 and 9127085. I had trouble applying this patch due to the documentation bug for ASM, Oracle opened a bug to correct the documentation but finally I was able to move forward with the function.
Note: Please open a tar with oracle and consult an SR Analyst about your situation. Do not apply the patch without proper Oracle engagement.
Hi Balaji
Thank you vor sharing what you are experiencing. Out of curiosity: what platform are you using?
Cheers,
Chris
Chris
I am sorry for a delayed response. We are using RHEL Linux 2.6, with 11.2.0.1 + ASM GI.
Thanks
Balaji
I am getting different error for one table ….
ERROR at line 1:
ORA-20000: Compression Advisor TEMP UNCMP table failure
ORA-06512: at “SYS.PRVT_COMPRESSION”, line 776
ORA-06512: at “SYS.DBMS_COMPRESSION”, line 214
ORA-06512: at line 9
Generally this error is suppose to occur for zero rows as per the oracle. But in my case table is having 3 rows