The Importance of the In-Memory DUPLICATE Clause for a RAC System

10 November 2014 20 Comments

Written by Christian Antognini

With the INMEMORY clause you can specify 4 sub-clauses:

The MEMCOMPRESS clause specifies whether and how compression is used
The PRIORITY clause specifies the priority (“order”) in which the segments are loaded when the IMCS is populated
The DISTRIBUTE clause specifies how data is distributed across RAC instances
The DUPLICATE clause specifies whether and how data is duplicated across RAC instances

The aim of this post is not to describe these attributes in detail. Instead, this post shows you the impact of the DUPLICATE clause when a query is executed in a RAC environment. For a short description, have a look to this table in the Database Reference guide. For more details, have a look to the Oracle Database In-Memory white paper.

It is essential to understand that the DUPLICATE clause (also known as “Fault Tolerant In-Memory Column Store”), according to the Oracle Database Licensing Information guide, requires Exadata or Supercluster. Also note that, according to this Twitter conversation, Oracle also plans to support the feature on ODA. Hence, the idea is to support it on all engineered systems.

In this post I would like to show you examples that make clear why the DUPLICATE clause is key to get good performance on a RAC system. To do so, I will execute a simple query like “SELECT sum(col) FROM tab” with different settings.

First of all, let me set the scene…

Since the In-Memory option is required, I have to use version 12.1.0.2.0

SQL> SELECT banner
  2  FROM v$version;

BANNER
--------------------------------------------------------------------------------
Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
PL/SQL Release 12.1.0.2.0 - Production
CORE    12.1.0.2.0      Production
TNS for Linux: Version 12.1.0.2.0 - Production
NLSRTL Version 12.1.0.2.0 - Production

What you cannot see from the previous output is that the database is hosted on RAC/Exadata system
The test table stores 12GB of data in about 1.5 million blocks (BTW, this is the same table I used in this post)

SQL> SELECT blocks, round(blocks*8192/1024/1024) AS size_mb
  2  FROM user_tables
  3  WHERE table_name = 'T';

    BLOCKS    SIZE_MB
---------- ----------
   1550061      12110

Note that parallel processing is disabled at the table level
At the beginning, make sure that the In-Memory Column Store (IMCS) is not used for the test table

SQL> ALTER TABLE t NO INMEMORY;

SQL> SELECT inst_id, populate_status, inmemory_size, bytes, bytes_not_populated
  2  FROM gv$im_user_segments
  3  WHERE segment_name = 'T';

no rows selected

TEST 1

Let me prepare the environment for the first test:

One instance is up and running

SQL> SELECT count(*)
  2  FROM gv$instance;

  COUNT(*)
----------
         1

Specify to use the IMCS for the test table and trigger the population

SQL> ALTER TABLE t INMEMORY;

SQL> SELECT count(*)
  2  FROM t;

  COUNT(*)
----------
 100000000

After waiting for few minutes, check that the population is over (POPULATE_STATUS=COMPLETED) and that the test table is completely stored in the IMCS (BYTES_NO_POPULATED=0).

SQL> SELECT inst_id, populate_status, inmemory_size, bytes, bytes_not_populated
  2  FROM gv$im_user_segments
  3  WHERE segment_name = 'T';

   INST_ID POPULATE_ INMEMORY_SIZE      BYTES BYTES_NOT_POPULATED
---------- --------- ------------- ---------- -------------------
         1 COMPLETED    1.0921E+10 1.2707E+10                   0

Finally, flush the buffer cache (note that the IMCS is not impacted by this action)

SQL> ALTER SYSTEM FLUSH BUFFER_CACHE;

Now that the environment is ready, I execute the test query and, at the same time, I activate extended SQL trace and gather runtime information:

The execution took 2.75 seconds (UPDATE 2014-12-11: The execution isn’t as fast as you might expect because SQL trace was enabled during the test. Without SQL trace the execution is about 10 times faster.)

SQL> execute dbms_session.session_trace_enable

SQL> SELECT /*+ gather_plan_statistics */ sum(n_32)
  2  FROM t;

 SUM(N_32)
----------
1550000000

Elapsed: 00:00:02.75

SQL> execute dbms_session.session_trace_disable

A serial execution taking advantage of an IM scan (TABLE ACCESS INMEMORY FULL) was used

SQL> SELECT * FROM table(dbms_xplan.display_cursor('7qry07h06c2xs',0,'iostats last'));

SQL_ID  7qry07h06c2xs, child number 0
-------------------------------------
SELECT /*+ gather_plan_statistics */ sum(n_32) FROM t

Plan hash value: 2966233522

-------------------------------------------------------------------------------------------------------
| Id  | Operation                   | Name | Starts | E-Rows | A-Rows |   A-Time   | Buffers | Reads  |
-------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |      |      1 |        |      1 |00:00:02.73 |       4 |      2 |
|   1 |  SORT AGGREGATE             |      |      1 |      1 |      1 |00:00:02.73 |       4 |      2 |
|   2 |   TABLE ACCESS INMEMORY FULL| T    |      1 |    100M|    100M|00:00:00.91 |       4 |      2 |
-------------------------------------------------------------------------------------------------------

The corresponding extended SQL trace information as shown by TKPROF is the following

SELECT /*+ gather_plan_statistics */ sum(n_32)
FROM t

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch        2      2.72       2.73          2          4          0           1
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        4      2.72       2.73          2          4          0           1

Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 95
Number of plan statistics captured: 1

Rows (1st) Rows (avg) Rows (max)  Row Source Operation
---------- ---------- ----------  ---------------------------------------------------
         1          1          1  SORT AGGREGATE (cr=4 pr=2 pw=0 time=2730184 us)
 100000000  100000000  100000000   TABLE ACCESS INMEMORY FULL T (cr=4 pr=2 pw=0 time=907260 us cost=17717 size=300000000 card=100000000)

Elapsed times include waiting on following events:
  Event waited on                             Times   Max. Wait  Total Waited
  ----------------------------------------   Waited  ----------  ------------
  SQL*Net message to client                       2        0.00          0.00
  cell single block physical read                 2        0.00          0.00
  SQL*Net message from client                     2        0.00          0.00

In summary, the execution was, for a query that had to process 12GB of data, quite fast. This was possible because all data was extracted from the IMCS.

TEST 2

The second test uses two instances. As a result, after starting the second instance, the Im-Memory attributes of the test table needed to be changed, and the IMCS had to be re-populated.

Two instances are up and running

SQL> SELECT count(*)
  2  FROM gv$instance;

  COUNT(*)
----------
         2

Data is duplicated on both instances

SQL> ALTER TABLE t INMEMORY DUPLICATE;

Trigger population

SQL> SELECT count(*)
  2  FROM t;

  COUNT(*)
----------
 100000000

After waiting for few minutes, check that the population is over and that test table is completely stored in both IMCS

SQL> SELECT inst_id, populate_status, inmemory_size, bytes, bytes_not_populated
  2  FROM gv$im_user_segments
  3  WHERE segment_name = 'T';

   INST_ID POPULATE_ INMEMORY_SIZE      BYTES BYTES_NOT_POPULATED
---------- --------- ------------- ---------- -------------------
         2 COMPLETED    1.0921E+10 1.2707E+10                   0
         1 COMPLETED    1.0921E+10 1.2707E+10                   0

Flush the buffer cache of both instances

Of course the test is the same as for TEST 1. The only difference is that the test table is stored in the IMCS of both instances.

The execution took 2.97 seconds (UPDATE 2014-12-11: The execution isn’t as fast as you might expect because SQL trace was enabled during the test. Without SQL trace the execution is about 10 times faster.)

SQL> execute dbms_session.session_trace_enable

SQL> SELECT /*+ gather_plan_statistics */ sum(n_32)
  2  FROM t;

 SUM(N_32)
----------
1550000000

Elapsed: 00:00:02.97

SQL> execute dbms_session.session_trace_disable

It goes without saying that the execution plan is the same as for TEST 1

SQL> SELECT * FROM table(dbms_xplan.display_cursor('7qry07h06c2xs',0,'iostats last'));

SQL_ID  7qry07h06c2xs, child number 0
-------------------------------------
SELECT /*+ gather_plan_statistics */ sum(n_32) FROM t

Plan hash value: 2966233522

-------------------------------------------------------------------------------------------------------
| Id  | Operation                   | Name | Starts | E-Rows | A-Rows |   A-Time   | Buffers | Reads  |
-------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |      |      1 |        |      1 |00:00:02.97 |       4 |      2 |
|   1 |  SORT AGGREGATE             |      |      1 |      1 |      1 |00:00:02.97 |       4 |      2 |
|   2 |   TABLE ACCESS INMEMORY FULL| T    |      1 |    100M|    100M|00:00:01.02 |       4 |      2 |
-------------------------------------------------------------------------------------------------------

And here is the extended SQL trace information

SELECT /*+ gather_plan_statistics */ sum(n_32)
FROM t

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch        2      2.95       2.96          2          4          0           1
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        4      2.95       2.97          2          4          0           1

Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 95
Number of plan statistics captured: 1

Rows (1st) Rows (avg) Rows (max)  Row Source Operation
---------- ---------- ----------  ---------------------------------------------------
         1          1          1  SORT AGGREGATE (cr=4 pr=2 pw=0 time=2967854 us)
 100000000  100000000  100000000   TABLE ACCESS INMEMORY FULL T (cr=4 pr=2 pw=0 time=1023321 us cost=17717 size=300000000 card=100000000)


Elapsed times include waiting on following events:
  Event waited on                             Times   Max. Wait  Total Waited
  ----------------------------------------   Waited  ----------  ------------
  SQL*Net message to client                       2        0.00          0.00
  cell single block physical read                 2        0.00          0.00
  SQL*Net message from client                     2        0.00          0.00

In summary, there is no real difference between TEST 1 and TEST 2.

TEST 3

The third test also uses two instances. But, this time, data is not fully duplicated on each instance. Instead, it is only distributed. This means that every row of the test table is stored in the IMCS of either one instance or the other.

Let me prepare the environment:

Two instances are up and running

SQL> SELECT count(*)
  2  FROM gv$instance;

  COUNT(*)
----------
         2

Data is not duplicated on both instances

SQL> ALTER TABLE t INMEMORY NO DUPLICATE;

Trigger population

SQL> SELECT count(*)
  2  FROM t;

  COUNT(*)
----------
 100000000

After waiting for few minutes, check that the population is over. Notice that since no instance contains all data, the BYTES_NOT_POPULATED column remains greater than 0

SQL> SELECT inst_id, populate_status, inmemory_size, bytes, bytes_not_populated
  2  FROM gv$im_user_segments
  3  WHERE segment_name = 'T';

   INST_ID POPULATE_ INMEMORY_SIZE      BYTES BYTES_NOT_POPULATED
---------- --------- ------------- ---------- -------------------
         2 COMPLETED    5726535680 1.2707E+10          6029524992
         1 COMPLETED    5194317824 1.2707E+10          6652174336

Flush the buffer cache of both instances

Once more, the same SQL statements are used to run the test:

The execution took 2 minutes and 40 seconds!!!!!

SQL> execute dbms_session.session_trace_enable

SQL> SELECT /*+ gather_plan_statistics */ sum(n_32)
  2  FROM t;

 SUM(N_32)
----------
1550000000

Elapsed: 00:02:39.95

SQL> execute dbms_session.session_trace_disable

The execution plan is the same as for TEST 1 and TEST 2. The difference is that a huge number of logical and physical I/O were performed

SQL> SELECT * FROM table(dbms_xplan.display_cursor('7qry07h06c2xs',0,'iostats last'));

SQL_ID  7qry07h06c2xs, child number 0
-------------------------------------
SELECT /*+ gather_plan_statistics */ sum(n_32) FROM t

Plan hash value: 2966233522

-------------------------------------------------------------------------------------------------------
| Id  | Operation                   | Name | Starts | E-Rows | A-Rows |   A-Time   | Buffers | Reads  |
-------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |      |      1 |        |      1 |00:02:39.94 |    1086K|   1086K|
|   1 |  SORT AGGREGATE             |      |      1 |      1 |      1 |00:02:39.94 |    1086K|   1086K|
|   2 |   TABLE ACCESS INMEMORY FULL| T    |      1 |    100M|    100M|00:02:31.06 |    1086K|   1086K|
-------------------------------------------------------------------------------------------------------

The information provided by extended SQL trace confirms what dbms_xplan shows. In addition, it also shows that the poor performance is due to the high number of single-block physical reads that were performed (about 275K)

SELECT /*+ gather_plan_statistics */ sum(n_32)
FROM t

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch        2     93.50     159.94    1086979    1086981          0           1
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        4     93.50     159.94    1086979    1086981          0           1

Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 95
Number of plan statistics captured: 1

Rows (1st) Rows (avg) Rows (max)  Row Source Operation
---------- ---------- ----------  ---------------------------------------------------
         1          1          1  SORT AGGREGATE (cr=1086981 pr=1086979 pw=0 time=159942519 us)
 100000000  100000000  100000000   TABLE ACCESS INMEMORY FULL T (cr=1086981 pr=1086979 pw=0 time=151063172 us cost=18478 size=300000000 card=100000000)


Elapsed times include waiting on following events:
  Event waited on                             Times   Max. Wait  Total Waited
  ----------------------------------------   Waited  ----------  ------------
  Disk file operations I/O                        1        0.00          0.00
  SQL*Net message to client                       2        0.00          0.00
  cell single block physical read            274946        0.03         73.01
  gc cr multi block request                    6368        0.03          4.19
  cell multiblock physical read                6368        0.02         10.00
  latch: cache buffers lru chain                  6        0.00          0.00
  latch: gcs resource hash                        2        0.00          0.00
  latch: gc element                             113        0.00          0.00
  SQL*Net message from client                     2        0.00          0.00

Since the bad performance is caused by the number of physical I/O, it goes without saying that if only logical I/O are taking place, the performance is much better. In this case, an immediate re-execution of the query took 8.28 seconds.

SQL> SELECT /*+ gather_plan_statistics */ sum(n_32)
  2  FROM t;

 SUM(N_32)
----------
1550000000

Elapsed: 00:00:08.28

SQL> SELECT * FROM table(dbms_xplan.display_cursor('7qry07h06c2xs',0,'iostats last'));

---------------------------------------------------------------------------------------
SQL_ID  7qry07h06c2xs, child number 0
-------------------------------------
SELECT /*+ gather_plan_statistics */ sum(n_32) FROM t

Plan hash value: 2966233522

----------------------------------------------------------------------------------------------
| Id  | Operation                   | Name | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |      |      1 |        |      1 |00:00:08.28 |    1086K|
|   1 |  SORT AGGREGATE             |      |      1 |      1 |      1 |00:00:08.28 |    1086K|
|   2 |   TABLE ACCESS INMEMORY FULL| T    |      1 |    100M|    100M|00:00:08.26 |    1086K|
----------------------------------------------------------------------------------------------

In summary, only part of the data was accessed through the IMCS. In addition, despite the fact that an IM scan was processed, except if data was in the buffer cache, many single-block physical reads were carried out. As a result, the performance was bad. This behaviour shows that a session is only able to access the IMCS it is managed by the instance it is connected to. This is also a good example showing that accessing data through the IMCS is faster than accessing the same data through the buffer cache.

TEST 4
The fourth and final test is a continuation of the previous one. Its aim is to check the performance of a parallel query. Hence, I added the PARALLEL hint to the test query. Also note that I flushed the buffer cache of both instances before executing the test query.

SQL> SELECT /*+ gather_plan_statistics parallel */ sum(n_32)
  2  FROM t;

 SUM(N_32)
----------
1550000000

Elapsed: 00:00:01.49

SQL> SELECT * FROM table(dbms_xplan.display_cursor('br2as2yzy7gk4',2,'iostats last'));

SQL_ID  br2as2yzy7gk4, child number 2
-------------------------------------
SELECT /*+ gather_plan_statistics parallel */ sum(n_32) FROM t

Plan hash value: 3126468333

---------------------------------------------------------------------------------------------------------------
| Id  | Operation                       | Name     | Starts | E-Rows | A-Rows |   A-Time   | Buffers | Reads  |
---------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                |          |      1 |        |      1 |00:00:01.41 |       7 |      1 |
|   1 |  SORT AGGREGATE                 |          |      1 |      1 |      1 |00:00:01.41 |       7 |      1 |
|   2 |   PX COORDINATOR                |          |      1 |        |     10 |00:00:01.34 |       7 |      1 |
|   3 |    PX SEND QC (RANDOM)          | :TQ10000 |      0 |      1 |      0 |00:00:00.01 |       0 |      0 |
|   4 |     SORT AGGREGATE              |          |      0 |      1 |      0 |00:00:00.01 |       0 |      0 |
|   5 |      PX BLOCK ITERATOR          |          |      0 |    100M|      0 |00:00:00.01 |       0 |      0 |
|*  6 |       TABLE ACCESS INMEMORY FULL| T        |      0 |    100M|      0 |00:00:00.01 |       0 |      0 |
---------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   6 - inmemory(:Z>=:Z AND :Z<=:Z)

Note
-----
   - automatic DOP: Computed Degree of Parallelism is 10
   - parallel scans affinitized for inmemory

The execution took 1.49 seconds. In other words, the performance of the parallel query is way better than the one of TEST 3. The reason can be seen in the Note section of the dbms_xplan output: "parallel scans affinitized for inmemory". This means that parallel slaves were started on both instances and that every set of parallel slaves accessed only the data stored in the IMCS of the instance they were attached to. As a result, even though the data was distributed, the performance was good. The only "overhead" compared to the serial execution is that the parallel slaves had to communicate with the query coordinator.

WHAT WE LEARNED

DUPLICATE is only applicable to engineered systems (as I write this blog, ODA does not yet support it)
On a RAC system, without DUPLICATE, data is distributed across all instances
A process can only access data stored in the IMCS of the instance it is connected to
Even though an IM scan is used, serial executions on RAC systems might require many logical/physical I/O. (UPDATE 2014-12-11: Parallel executions using Auto DOP don't suffer from this problem.)

In summary, I would not advise to use the In-Memory option on a RAC system that does not support the DUPLICATE clause. At least, not without Auto DOP. (UPDATE 2014-12-12: the previous sentence was "At least, not in general.")

12.1.0.2, Exadata, In-Memory, Parallel Processing, RAC, SQL Trace

20 Comments

6 Pings/Trackbacks

Yasin Baskan

10 November 2014

Hi Christian,

Not every parallel query can access IM segments on other nodes. As explained in the related whitepaper (http://www.oracle.com/technetwork/database/in-memory/overview/twp-oracle-database-in-memory-2245633.html) Auto DOP is required for this. Your statement level hint enables Auto DOP which provides the affinity of scans. So you need Auto DOP enabled either with parallel_degree_policy=auto (or adaptive) or with a hint like you did.

Reply
- Christian Antognini
  
  11 November 2014
  
  Hi
  
  You are absolutely right. Thank you for pointing it out.
  
  Best,
  Chris
  
  Reply

Jonathan Lewis

11 November 2014

Chris,

Was your tablespace locally managed with uniform extents of 1MB ? If so that could explain the 275,000 single block reads in Test 3

Christian Antognini

11 November 2014

Hi Jonathan

No uniform extent size. Here are the details:

SQL> SELECT extent_management, allocation_type, segment_space_management
  2  FROM user_tablespaces
  3  WHERE tablespace_name = (SELECT tablespace_name
  4                           FROM user_tables
  5                           WHERE table_name = 'T');

MGM ALL SEGMEN
--- --- ------
LOC SYS AUTO

SQL> SELECT bytes, count(*)
  2  FROM user_extents
  3  WHERE segment_name = 'T'
  4  GROUP BY bytes
  5  ORDER BY bytes;

     BYTES   COUNT(*)
---------- ----------
     65536         16
   1048576         63
   8388608        120
  29360128          1
  63963136          2
  67108864        171

Could you please explain why a uniform extent size of 1MB could explain that?

BTW, most multi-block reads (cell multiblock physical read) are of 1MB.

Thank you,
Chris

Quick Links to Helpful Literature on Oracle Database In-Memory Column Store. | Kevin Closson's Blog: Platforms, Databases and Storage

12 November 2014

[…] Fellow Oaktable Network member Christian Antognini has produced a very important article regarding how Oracle Database 12c In-Memory Column Store with Real Application Clusters is questionable unless using Oracle storage (Exadata, SPARC SuperCluster). I found Christian’s article very interesting because, after all, the topic at hand is an in-memory cache product (a.k.a., In-Memory Column Store). I fail to see any technical reason why Oracle wouldn’t support an in-memory product with blocks from any and all storage. It is in-memory after all, isn’t it? Please visit Christian’s article here: The Importance of the In-Memory DUPLICATE Clause for a RAC System. […]
Mark W. Farnham

12 November 2014

This all begs the question of why you are not allowed to control instances considered for the distribution of the non-duplicate in memory column store. Perhaps this non-feature (surely trivial to NOT distribute at all if only a single instance is listed for a table) yielding your only one instance up results if the correct instance is used for the query is the continuance of the deprecation of application affinity. Perhaps Oracle will see the light and realize allowing instance designation dovetailed with application affinity could allow customers to have a functional column store much larger in aggregate for the operational cost of grouping analytical targets by instance. To me this seems like a worthwhile feature to add for non-engineered systems in particular, but for any RAC implementation where natural opportunities for application affinity exist.

Reply
Kerry Osborne

12 November 2014

Hi Chris,

I am surprised by all the single block reads in test 3l. In my tests it generally uses multi-block reads. Do you have ideas as to why that might be occurring?

Kerry

Reply
- Christian Antognini
  
  13 November 2014
  
  Hi Kerry
  
  Unfortunately, I didn’t have time to look at it in detail. My present guess: it is because of the way data is distributed. For example, I would like to test whether I am able to reproduce the same issue with a partitioned table using DISTRIBUTE BY PARTITION.
  
  In any case, the problem was reproduced independently by a colleague of mine on a different system (a non-engineered one) with completely different data.
  
  Best,
  Chris
  
  Reply
Brian Peasland

14 November 2014

I find the DUPLICATE clause to be misleading in non-Engineered systems. You can specify it with no errors. Explain Plans on other instances show the InMemory option is used. Yet the segment is only in one instance’s IM store. I blogged my findings here: http://www.peasland.net/2014/11/13/inmemory-duplicate-confusion-in-oracle-rac/

Thanks,
Brian

Reply
- Christian Antognini
  
  14 November 2014
  
  Hi Brian
  
  I fully agree with you. It should not be possible to set it in case it’s not used.
  What I do not understand in what you posted is why data is not distributed. Did you wait long enough?
  
  Best,
  Chris
  
  Reply
Bertrand

5 December 2014

Hi Christian,

I tried to get rid of the default behaviour on non Engineered Systems (“Every row of the test table is stored in the IMCS of either one instance or the other”) for 2 reasons:

1) Your finding: It could lead to bad performance.
2) In case of service defined as preferred/available, it would make sense to have the IMCS populated with all the DATA on the “active” Instance.

To get rid of the default behaviour, we can set the “_inmemory_auto_distribute” to false.

I tested it for:

RAC (non CDB) here: https://bdrouvot.wordpress.com/2014/11/20/in-memory-instance-distribution-with-rac-databases-i-want-at-least-one-instance-that-contains-all-the-data/
RAC with PDB here: https://bdrouvot.wordpress.com/2014/11/21/in-memory-instance-distribution-with-rac-and-multitenant-environment/
RAC with PDB and service defined as preferred/available here (Note that in that case FAN callouts can be enough): https://bdrouvot.wordpress.com/2014/12/04/fan-callouts-for-in-memory-instance-distribution-with-rac-pdb-and-preferredavailable-service/

Thx
Bertrand

Reply
In-memory DB | Oracle Scratchpad

26 January 2015

[…] Christian Antognini’s blog item on the Duplicate clause in RAC. […]
In-Memory Instance distribution with RAC databases: I want at least one instance that contains all the data | bdt's oracle blog

21 March 2015

[…] Christian Antognini show us that having the data not fully populated on each instance could lead to bad performance into this blog post. […]
In-Memory Instance distribution with RAC databases: Impact of the parallel_instance_group parameter | bdt's oracle blog

26 March 2015

[…] This distribution could lead to performance issue. […]
FAN callouts for In-Memory Instance distribution with RAC, PDB and preferred/available service | bdt's oracle blog

27 March 2015

[…] Antognini show us into this blog post that having the data not fully populated on an Instance (default behaviour on non Engineered […]
Les bonnes raisons d’utiliser Exadata et Oracle DB-Inmemory ensemble | Bastien Leblanc

11 June 2015

[…] Un très bon blog de Christian Antognini à lire sur le sujet: http://antognini.ch/2014/11/the-importance-of-the-in-memory-duplicate-clause-for-a-rac-system/ […]
Alberto Dell'Era

6 June 2016

Hi Chris,

It’s also worth knowing that, perhaps unsurprisingly, if parallel_force_local=true the IM segments stored on other instances cannot be accessed,
and the local instance will ask the rows to the cells/disk/buffer cache instead.
Just checked on our X5-2 12.1.2.2.1.160119.

Hence using DISTRIBUTE (DISTRIBUTE ALL, in the general case of 3+ RAC instances) becomes of paramount importance if you want to (or are forced to due to bugs) use this setting …

ciao
al

Reply
- Christian Antognini
  
  6 June 2016
  
  Ciao Alberto
  
  Thank you for sharing the result of your test. Honestly, I’m not surprised. At all.
  
  Best,
  Chris
  
  Reply
Giuseppe

21 November 2020

Hi Chris,
In addition to single block physical read, on Test 3 I noticed another “stranger things”.
The elapsed time of the statement is about 160 seconds, but if I sum cpu time (93.5) and wait time (73.01+4.19.10), the elapsed time should be 180.7 seconds.
How can this is possible?
What am I doing wrong?

Thanks and regards.

Giuseppe

Reply
- Christian Antognini
  
  23 November 2020
  
  Ciao Giuseppe
  
  I ought to be an instrumentation problem. In fact, the total elapsed time matches to the time shown by SQL*Plus.
  Since the test was carried out in 12.1.0.2, it could be “Bug 16915985 : INCORRECT WAIT TIMING WITH SERIAL DIRECT PATH READS ON ASM WITH SYNCHRONOUS IO.”
  
  Hence, it’s not you doing something wrong.
  
  Best,
  Chris
  
  Reply