Even though the utilization of an Exadata Storage Server should be transparent for the query optimizer, when you look at execution plans generated in an environment using it you might notice slight differences. The purpose of this series of post is to summarize the differences I was able to observe.

Disclaimer: I do not have access to an Exadata Storage Server (cell, from now on…). All tests I did, i.e. generation of execution plans, were performed with an 11gR2 database using “regular” storage. All I did is to set the CELL_OFFLOAD_PLAN_DISPLAY initialization parameter to ALWAYS. As a result, this post contains untested assumptions based on what I was able to observe in such an environment. It goes without saying that any correction is highly welcome!

Smart scan is one of the key performance features available only when a cell is used. Its purpose is to offload to the storage layer part of the work that would usually be performed by a database instance. As a result, not only the amount of work performed by a database instance might be (much) lower, but also the amount of data transferred from a cell to a database instance might be strongly reduced. When smart scan is used a cell, instead of sending back to the database engine “regular” database blocks (as a “regular” storage layer would do), sends back packets of data containing only the relevant information (I guess they are similar to the messages exchanged during PX operations).

Three are the basic smart scan techniques: projection, restriction and join filtering. As of Oracle Database 11g Release 2 three additional improvements are available: storage indexing, hybrid columnar compression and encryption. But, honestly, I consider them “simple” enhancements (that might have a major impact, though) of the previous ones. In other words, they add nothing to the basic concepts.

In this first post I cover projection. In the next posts I will cover the other two.

Whenever a query does not reference all columns belonging to the tables referenced in the FROM clause AND smart scan is used, it makes no sense that a cell sends back to a database instance the data belonging to the unreferenced columns. Since it makes no sense, it’s not done. Instead, a cell is able to extract only the data of the referenced columns. In this way, especially when queries reference few columns from wide tables, the amount of data received by a database instance is much smaller. As far as I know this technique is not explicitly externalized in execution plans (I mean, you don’t see which columns are actually sent back to the database instance). But, I guess, if you see one of the new operations added to support Exadata (they all contains the keyword “STORAGE” in them), you take advantage of this technique. Here is an example with a FTS (notice the keyword “STORAGE”):

------------------------------------------
| Id  | Operation                 | Name |
------------------------------------------
|   0 | SELECT STATEMENT          |      |
|   1 |  TABLE ACCESS STORAGE FULL| T    |
------------------------------------------

The full list of new “operations” that I was able to observe is the following:

  • STORAGE FULL
  • STORAGE SAMPLE
  • STORAGE SAMPLE BY ROWID RANGE
  • STORAGE BY ROWID RANGE
  • STORAGE SAMPLE FAST FULL SCAN
  • STORAGE FAST FULL SCAN
  • STORAGE FULL SCAN
  • STORAGE RANGE SCAN
  • STORAGE FULL SCAN (MIN/MAX)

So, it seems that for the most common access paths there is a “smart scan version”.