Instance caging is another small but useful feature of Oracle Database 11g Release 2. Thanks to it the database resource manager is able, for the first time, to limit the number of CPUs that can be used by a given instance. (By the way, note that this limit has no “impact” on the number of CPUs that have to be licensed.)
Using instance caging requires only two simple configurations from our part:
- Enable the resource manager by assigning a resource plan (e.g. DEFAULT_PLAN)
- Set the number of CPUs through the CPU_COUNT initialization parameter
To illustrate how it works, let me show you what I observed on my small test server with 4 cores.
To burn all the available CPU resources, I started four sessions executing the following PL/SQL block:
DECLARE n NUMBER; BEGIN WHILE (TRUE) LOOP n:= dbms_random.random(); END LOOP; END;
With the four sessions up and running, the output of vmstat(8) looks like the following:
procs ------------memory------------ --swap-- ----io---- --system-- -----cpu----- r b swpd free buff cache si so bi bo in cs us sy id wa 5 0 0 6037164 71568 5476620 0 0 66 198 1033 823 100 0 0 0 5 0 0 6037164 71572 5476616 0 0 0 158 1021 801 100 0 0 0 5 0 0 6037164 71572 5476616 0 0 0 48 1016 750 100 0 0 0
Notice that there is no idle time and that the number of processes waiting for run time is 5.
Now, to show what instance caging can do to limit the CPU utilization, I started the following PL/SQL block:
DECLARE l_sql VARCHAR2(100) := 'ALTER SYSTEM SET cpu_count = '; BEGIN EXECUTE IMMEDIATE l_sql || '4'; dbms_lock.sleep(10); EXECUTE IMMEDIATE l_sql || '3'; dbms_lock.sleep(10); EXECUTE IMMEDIATE l_sql || '2'; dbms_lock.sleep(10); EXECUTE IMMEDIATE l_sql || '1'; dbms_lock.sleep(10); EXECUTE IMMEDIATE l_sql || '0'; END;
This time the output of vmstat(8) looks like the following:
procs ------------memory------------ --swap-- ----io---- --system-- -----cpu----- r b swpd free buff cache si so bi bo in cs us sy id wa 5 0 0 3695176 80856 7780352 0 0 0 96 1019 783 100 0 0 0 5 0 0 3695176 80856 7780352 0 0 0 170 1023 795 100 0 0 0 5 0 0 3695184 80856 7780352 0 0 0 156 1028 795 100 0 0 0 5 0 0 3695184 80860 7780348 0 0 0 120 1021 795 100 0 0 0 5 0 0 3694928 80860 7780348 0 0 2 168 1029 810 100 0 0 0 3 0 0 3694928 80860 7780348 0 0 0 214 1029 939 77 0 23 0 3 0 0 3694928 80864 7780344 0 0 0 118 1020 961 75 0 25 0 3 0 0 3694928 80864 7780344 0 0 2 152 1026 961 75 0 25 0 3 0 0 3694596 80868 7780340 0 0 2 158 1027 975 75 0 25 0 3 0 0 3694612 80868 7780340 0 0 0 142 1031 979 75 0 25 0 3 0 0 3694612 80868 7780340 0 0 2 164 1024 963 75 0 25 0 3 0 0 3694616 80872 7780336 0 0 0 358 1079 961 52 0 49 0 2 0 0 3694616 80872 7780336 0 0 0 120 1021 909 50 0 50 0 2 0 0 3697312 80872 7780336 0 0 0 162 1025 952 50 0 50 0 2 0 0 3694744 80876 7780332 0 0 0 142 1027 948 50 0 50 0 1 0 0 3694744 80876 7780332 0 0 0 120 1021 954 40 0 60 0 1 0 0 3694748 80876 7780332 0 0 0 234 1034 953 26 0 74 0 1 0 0 3694748 80876 7780332 0 0 0 134 1021 921 26 0 74 0 1 0 0 3696484 80876 7780332 0 0 0 120 1020 954 26 0 74 0 1 0 0 3696476 80880 7780328 0 0 2 196 1035 996 26 0 74 0 5 0 0 3696476 80880 7780328 0 0 0 112 1020 643 96 0 4 0 6 0 0 3696484 80880 7780328 0 0 0 216 1040 778 100 0 0 0 5 0 0 3696484 80884 7780324 0 0 0 160 1021 763 100 0 0 0 5 0 0 3696484 80884 7780324 0 0 0 112 1020 775 100 0 0 0 5 0 0 3696468 80888 7780320 0 0 0 156 1026 785 100 0 0 0
Notice that not only the idle time but also the number of processes waiting for run time dynamically changes according to the value specified for the CPU_COUNT initialization parameter. Great!
The only problem I faced during my tests is that the management does not work as expected when lot of time is spent running kernel code. On my server I am able to reproduce such a problem by running the following PL/SQL block:
BEGIN WHILE (TRUE) LOOP FOR i IN (SELECT * FROM t) LOOP NULL; END LOOP; END LOOP; END;
When four sessions run that code, the output of vmstat(8) looks like the following:
procs ------------memory------------ --swap-- ----io---- --system-- -----cpu----- r b swpd free buff cache si so bi bo in cs us sy id wa 4 0 0 3690112 81032 7780652 0 0 0 138 1025 786 65 35 0 0 4 0 0 3690112 81036 7780648 0 0 0 182 1024 775 66 34 0 0 4 0 0 3690112 81036 7780648 0 0 0 84 1018 779 66 34 0 0 4 0 0 3690112 81036 7780648 0 0 0 138 1024 822 66 34 0 0 4 0 0 3690112 81040 7780644 0 0 0 204 1032 809 66 34 0 0 4 0 0 3690112 81044 7780640 0 0 0 120 1027 798 63 36 1 0 4 0 0 3690112 81044 7780640 0 0 0 146 1021 833 61 37 2 0 4 0 0 3690112 81044 7780640 0 0 0 160 1023 785 63 35 1 0 5 0 0 3690112 81044 7780640 0 0 0 98 1021 782 62 36 1 0 4 0 0 3690112 81044 7780640 0 0 0 146 1021 818 64 36 1 0 4 0 0 3690048 81044 7780640 0 0 2 180 1031 803 63 36 1 0 3 0 0 3690048 81044 7780640 0 0 0 126 1027 921 53 29 18 0 3 0 0 3690048 81044 7780640 0 0 0 146 1022 988 46 25 29 0 3 0 0 3690044 81048 7780636 0 0 0 180 1032 976 48 25 28 0 3 0 0 3690044 81048 7780636 0 0 0 212 1058 974 48 25 27 0 2 0 0 3690052 81048 7780636 0 0 0 144 1028 950 47 23 30 0 2 0 0 3690052 81048 7780636 0 0 0 204 1035 971 34 14 51 0 2 0 0 3691404 81048 7780636 0 0 0 106 1020 939 32 13 55 0 2 0 0 3691404 81048 7780636 0 0 0 124 1021 972 34 14 53 0 2 0 0 3691404 81048 7780636 0 0 0 202 1032 986 33 14 53 0 4 0 0 3691404 81048 7780636 0 0 0 80 1024 850 51 29 21 0 4 0 0 3691404 81048 7780636 0 0 0 172 1028 790 65 35 0 0 4 0 0 3691404 81048 7780636 0 0 0 220 1033 785 63 37 0 0 5 0 0 3691404 81048 7780636 0 0 0 62 1016 771 64 36 0 0 5 0 0 3691404 81048 7780636 0 0 0 170 1022 785 62 38 0 0 5 0 0 3691404 81048 7780636 0 0 0 18 1026 772 64 36 0 0
Notice that in this case about one third of the time is spent running kernel code. As a result, the CPU utilization (for the values of 3, 2 and 1) is always higher than expected.
Christian, thanks for your update on instance caging.
What leaves me clueless is that setting CPU_COUNT has no influence on the number of CPUs I have to licenses. This makes virtualization of Oracle DBs even more unfeasible. Say e.g., I have one physical machine with four CPUs and four virtual machines running a DB each on it. Until now, I needed to get four 4-CPU licences for that. Now, I can set CPU_COUNT to one for all instances and I still have to get the same licenses. This is ridiculous.
IMHO the only reasonable way of setting the CPU_COUNT is to open the case and physically unplug CPUs from the machine.
@Marcus
This feature is meant to be used for resource management of multiple instances on a single server/platform. It allows one to “curb” instances to limit the maximum CPU. One can set the sum of CPU_COUNT for all instances to match the total physical CPU count (using instance caging to partition) or the sum of the CPU_COUNTs can be higher than the physical CPU count (using instance caging to over-provision, but limit). The latter potentially allows for better host utilization given that each individual instance is below its max CPU threshold.
For licensing, you can use hard partitioning with Oracle VM to limit the CPUs that have to be licensed as shown by this wiki note: http://wiki.oracle.com/page/Hard+partitioning
[…] Antognini is, as always, Striving for Optimal Performance. He has a worthwhile post on instance caging, “ . . . nother small but useful feature of Oracle Database 11g Release […]
[…] 1-How to limit number of CPU for an instance with instance caging feature of 11GR2 ? Christian Antognini-Instance Caging […]
Nice post Christian. Very clear and helpful.
I wonder if Oracle is contemplating moving the CBO to a % based mechanism rather than CPU_COUNT. That would allow us to cage instances to a percentage of CPU rather than a fixed number. No doubt that would be a big effort. If they did it would behave much like the DBRM manages CPU internally through the plan directives.
Hello all,
I’ve been doing some tests on the AIX platform with regards to instance caging.
I’ve set it up on an LPAR with 2 logical CPUs, but have been unable to actually limit the CPU usage.
It seems that no matter what the limit is set to, I can always manage to eat up all the cpu allocation on server level.
This shouldn’t really happen, but I have no clue as to what i might have done wrong. I’ve got the resource_manager setup and set the cpu_count parameter according to the importance (diving to 1 each) over each instance.
Hi Erwin
I’m not aware of particular problems in this area. Since I do not have an AIX system handy, I cannot test it either. You might try to open an SR…
Cheers,
Chris
[…] Antognini is, as always, Striving for Optimal Performance. He has a worthwhile post on instance caging, “ . . . nother small but useful feature of Oracle Database 11g Release […]
[…] dans le white-paper dédié sur OTN et dans plusieurs articles de la blogosphère à commencer par celui de Christian Antognini. La seconde est liée à l’utilisation du paramètre max_utilization_limit de ressource […]
[…] dans le white-paper dédié sur OTN et dans plusieurs articles de la blogosphère à commencer par celui de Christian Antognini. La seconde est liée à l’utilisation du paramètre max_utilization_limit de ressource […]