Instance caging vs. CPU pooling

Sven Illert

When working with Oracle databases, the commonly recommended methods to restrict the usage of cpu resources are the so-called instance caging and hard partitioning using something like Oracle Linux Virtualization Manager. The latter comes with some resource wastage though, because you have to maintain multiple operating systems for multiple databases. So let’s assume that is some sort of last resort for the licensing dilemma. Instead, many of us like to have multiple databases coupled together in one operating environment and want to have some measures to tame the herd. For that there are several layers where you can put constraints on your databases, beginning with limiting the number of sessions a database can handle (see the sketch below). But to reduce a database’s cpu usage to a specific load and/or a subset of your multi-core/multi-socket system, there are currently two commonly used methods. Look at the end of this article to see how I implemented the testing method.
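Such a session cap could look like the following (a minimal sketch; the value 200 is made up, and SESSIONS set at the instance level only takes effect after a restart):

ALTER SYSTEM SET sessions=200 SCOPE=SPFILE;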

Instance caging

First, let’s have a look at the most obvious way, which is used in most modern deployments on Oracle engineered systems or in Oracle Cloud Infrastructure. Its name is instance caging, and it is configured with the initialization parameter CPU_COUNT. By default this parameter has the value of the cpu thread count of your system. Today most x86-compatible systems have some sort of multithreading, which means that the parameter will initially be set to “NumberOfSockets x NumberOfCores x 2” if you don’t use some sort of Oracle-based tailoring mentioned before (engineered systems, OCI). But be aware of the fact that if you are using an Oracle Enterprise Edition database, you also have to specify a resource manager plan, which is typically not set by default. If you just want to use a predefined plan in a CDB, you can use the following SQL statement.

ALTER SYSTEM SET resource_manager_plan='DEFAULT_CDB_PLAN';
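Whether some tailoring has already happened on your system is quickly verified; SHOW PARAMETER works in SQL*Plus and SQLcl alike:

SHOW PARAMETER cpu_count
SHOW PARAMETER resource_manager_plan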

With the use of instance caging, the resource manager reduces the database’s load to the number of logical cpus specified by the CPU_COUNT parameter. This is even done implicitly in the Standard Edition 2 version of the database software (actually, this is what is used to limit the database to 16 threads in SE2). Let’s have a look at what happens to the load of all cpus in a VM with 2 cores and 4 threads while 5 parallel sessions are running. Below you can see the load of the database system before any test execution.

% top -u 12345 -1 -bn1
top - 21:57:27 up  2:07,  1 user,  load average: 0.50, 1.64, 2.64
Tasks: 520 total,   1 running, 504 sleeping,  15 stopped,   0 zombie
%Cpu0  :  0.0 us,  5.6 sy,  0.0 ni, 94.4 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu1  :  5.3 us, 21.1 sy,  0.0 ni, 73.7 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu2  :  5.6 us,  0.0 sy,  0.0 ni, 94.4 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu3  :  0.0 us,  0.0 sy,  0.0 ni,100.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st

We can see that all 4 cpu threads are idling around. Now let’s put some workload on the database.

% sql / as sysdba @$PWD/runtest.sql > /dev/null &
[1] 65189
% sql / as sysdba @$PWD/runtest.sql > /dev/null &
[2] 65235
% sql / as sysdba @$PWD/runtest.sql > /dev/null &
[3] 65354
% sql / as sysdba @$PWD/runtest.sql > /dev/null &
[4] 65419
% sql / as sysdba @$PWD/runtest.sql > /dev/null &
[5] 65476

After some time we get a pretty high load, which makes the database and the whole system increasingly slow. We can see that the load average rises above 5 (which is more than the number of threads available to the VM) and the idle part of the cpu usage is zero.

% top -u 12345 -1 -bn1
top - 21:59:28 up  2:09,  1 user,  load average: 5.32, 2.83, 2.93
Tasks: 532 total,  10 running, 507 sleeping,  15 stopped,   0 zombie
%Cpu0  : 83.3 us, 16.7 sy,  0.0 ni,  0.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu1  : 95.8 us,  0.0 sy,  0.0 ni,  0.0 id,  0.0 wa,  4.2 hi,  0.0 si,  0.0 st
%Cpu2  : 95.7 us,  4.3 sy,  0.0 ni,  0.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu3  :100.0 us,  0.0 sy,  0.0 ni,  0.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st

The good thing about instance caging is that all necessary parameters are modifiable at runtime and don’t require downtime. So let’s set the cpu count to 2 (which means 2 logical cpus or threads), set a resource manager plan and wait some time to let the database throttle itself.

% sql / as sysdba
SQL> alter system set cpu_count=2;

System SET altered.

SQL> ALTER SYSTEM SET resource_manager_plan='DEFAULT_CDB_PLAN';

System SET altered.

After some time we can check the load and usage of the cpus again: it has dropped to a level where the system is no longer overloaded, while the sessions are still running.

% top -u 12345 -1 -bn1
top - 22:13:07 up  2:23,  1 user,  load average: 2.93, 3.80, 4.06
Tasks: 538 total,   7 running, 516 sleeping,  15 stopped,   0 zombie
%Cpu0  : 25.0 us,  0.0 sy,  0.0 ni, 70.0 id,  0.0 wa,  5.0 hi,  0.0 si,  0.0 st
%Cpu1  :  5.3 us, 15.8 sy,  0.0 ni, 78.9 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu2  : 30.0 us, 10.0 sy,  0.0 ni, 55.0 id,  0.0 wa,  5.0 hi,  0.0 si,  0.0 st
%Cpu3  : 23.8 us,  0.0 sy,  0.0 ni, 66.7 id,  0.0 wa,  4.8 hi,  4.8 si,  0.0 st

% sql / as sysdba
SQL> select sid, sql_id, status from v$session where username='SYS';

   SID           SQL_ID    STATUS
______ ________________ _________
    23 0gkvd80767mt0    ACTIVE
    26 0gkvd80767mt0    ACTIVE
   121                  ACTIVE
   267 0gkvd80767mt0    ACTIVE
   379 0gkvd80767mt0    ACTIVE
   380 0gkvd80767mt0    ACTIVE
   385 dcaztk5ucxwd9    ACTIVE

7 rows selected.

Conclusion: Even if the system is already under load, you can tame a database gone crazy. Instance caging is also highly recommended to avoid overutilization of the cpu resources of a system. In the end it does not really matter whether the application waits for the operating system scheduler or for the resource manager, since the cpu resources are limited either way. But a skyrocketing load has an impact on the whole system, especially in a RAC, where the clustering software has to synchronize across nodes. I did not run performance tests on these implications, so this might not be the method to squeeze out the last bit of speed. But there’s another method to play around with.

CPU pooling

When preparing for my recent appearance at DOAG Database 2023 in Düsseldorf I investigated the cpu pooling features of the Oracle Database Appliance. I found it quite interesting that besides the resource manager you can also bind the database to a specific set of cpus utilizing the cgroups feature of the Linux kernel. Cgroups were introduced in 2008, and Oracle started to support this kind of resource management with the release of Oracle Database 12.1 in 2013.

So let’s see how we can create a cpuset. Since it is part of the cgroups feature, everything is managed through the /sys/fs/cgroup pseudo filesystem. The corresponding subdirectory is of course /sys/fs/cgroup/cpuset, and since it is explicitly allowed to be modified through filesystem manipulations in the best UNIX tradition, the command to create a cpuset with the name loadtest is as simple as creating a directory in the cpuset path. When doing so, the directory is automatically populated with the relevant files to control that cpuset. In this case I configured it to use the last 2 logical cpus of my VM with the echo command.

% mkdir /sys/fs/cgroup/cpuset/loadtest
% ls -1 /sys/fs/cgroup/cpuset/loadtest
cgroup.clone_children
cgroup.procs
cpuset.cpu_exclusive
cpuset.cpus
cpuset.effective_cpus
cpuset.effective_mems
cpuset.mem_exclusive
cpuset.mem_hardwall
cpuset.memory_migrate
cpuset.memory_pressure
cpuset.memory_spread_page
cpuset.memory_spread_slab
cpuset.mems
cpuset.sched_load_balance
cpuset.sched_relax_domain_level
notify_on_release
tasks
% /bin/echo "2,3" > /sys/fs/cgroup/cpuset/loadtest/cpuset.cpus
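Reading the file back shows whether the kernel accepted the list (it may canonicalize “2,3” into the range notation “2-3”):

% cat /sys/fs/cgroup/cpuset/loadtest/cpuset.cpus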

So what’s necessary when I want to use that cpuset in my database? There’s a handy parameter for that, named PROCESSOR_GROUP_NAME. Let’s try and modify the parameter to use our freshly configured cpuset and bounce the database.

% sql / as sysdba
SQL> ALTER SYSTEM SET processor_group_name='loadtest' SCOPE=SPFILE;
SQL> SHUTDOWN IMMEDIATE;
SQL> STARTUP OPEN;
ORA-01078: failure in processing system parameters

OOPS! What’s wrong here? The alert log is pretty quiet about this, no message after the shutdown. OK, let’s try SQL*Plus just in case.

% sqlplus / as sysdba
SQL> startup open;
ORA-56729: Failed to bind the database instance to processor group loadtest;
 Additional Information: cpuset.mems is not set
 Location: chkcpuset:3
ORA-01078: failure in processing system parameters

Apparently we need to set the cpuset.mems parameter, which configures the NUMA memory nodes the processes should have access to. Going further down this road we would also need to modify permissions and so on to make all that work; this manual detour was just to show the low-level plumbing, and I think you can imagine what’s going on from here. Normally you don’t want to create the cpuset by hand, so we need a way to configure it persistently and have it restored at boot time. But first, let’s quickly prove the diagnosis.
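Writing the missing value by hand should do the trick (a throwaway step; I’m assuming a VM with a single NUMA node, node 0):

% /bin/echo "0" > /sys/fs/cgroup/cpuset/loadtest/cpuset.mems

That clears this particular error (the next hurdle would be the permissions of the root-owned cpuset files), but none of it survives a reboot. For the persistent setup we install the libcgroup-tools package (for distributions other than OL/RHEL consult their documentation), which also contains the cgconfig systemd service, and that of course has to be enabled.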

% dnf install libcgroup-tools
% systemctl enable cgconfig --now

To configure the cpuset to use only the last 2 of the 4 logical cpus of my virtual machine, let’s add the corresponding configuration to /etc/cgconfig.conf. Since we want our Oracle database to use that cpuset, we also have to specify permissions so that it can access it.

group loadtest {
    perm {
        task {
            uid = root;
            gid = oinstall;
            fperm = 660;
        }
        admin {
            uid = root;
            gid = oinstall;
            fperm = 770;
            dperm = 770;
        }
    }
    cpuset {
        cpuset.cpus=2,3;
        cpuset.mems=0;
    }
}
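As a side note, the same file can also be parsed ad hoc with cgconfigparser, which ships with libcgroup and is essentially what the cgconfig service runs at startup:

% cgconfigparser -l /etc/cgconfig.conf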

With systemctl we can now load this configuration. Let’s check the permissions and the value of the cpuset.mems parameter before and after, and then try to start the database again.

% ls -l /sys/fs/cgroup/cpuset/loadtest/cpuset.cpus
-rw-r--r--. 1 root root 0 May 27 10:31 /sys/fs/cgroup/cpuset/loadtest/cpuset.cpus
% cat /sys/fs/cgroup/cpuset/loadtest/cpuset.mems
% systemctl restart cgconfig
% ls -l /sys/fs/cgroup/cpuset/loadtest/cpuset.cpus
-rw-rw----. 1 root oinstall 0 May 27 11:14 /sys/fs/cgroup/cpuset/loadtest/cpuset.cpus
% cat /sys/fs/cgroup/cpuset/loadtest/cpuset.mems
0
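As a database-independent cross-check, cgexec (also part of libcgroup) starts an arbitrary process inside a cgroup; the Cpus_allowed_list of such a process should come back as 2-3:

% cgexec -g cpuset:loadtest grep Cpus_allowed_list /proc/self/status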

% sql / as sysdba
SQL> startup open;
ORACLE instance started.
Database mounted.
Database opened.

Now our database will bind itself to the cpus of that cpuset. Time to test it again: first we check that the system has settled and does not show any significant load, then we start the load test again and wait for some time.

% top -u 12345 -1 -bn1
top - 14:07:59 up 3 days, 18:17,  2 users,  load average: 0.08, 0.24, 0.44
Tasks: 502 total,   1 running, 501 sleeping,   0 stopped,   0 zombie
%Cpu0  :  5.0 us,  0.0 sy,  0.0 ni, 90.0 id,  0.0 wa,  0.0 hi,  5.0 si,  0.0 st
%Cpu1  :  5.3 us,  5.3 sy,  0.0 ni, 89.5 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu2  : 10.5 us, 21.1 sy,  0.0 ni, 68.4 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu3  :  5.0 us,  5.0 sy,  0.0 ni, 90.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st

% sql / as sysdba @$PWD/runtest.sql > /dev/null &
[1] 660759
% sql / as sysdba @$PWD/runtest.sql > /dev/null &
[2] 660804
% sql / as sysdba @$PWD/runtest.sql > /dev/null &
[3] 660849
% sql / as sysdba @$PWD/runtest.sql > /dev/null &
[4] 660875
% sql / as sysdba @$PWD/runtest.sql > /dev/null &
[5] 660902
% sql / as sysdba @$PWD/runtest.sql > /dev/null &
[6] 660985

After an hour the load of the machine is at 3.00 with 6 parallel sessions running the workload, but with the resource manager enabled. And we can see that only the logical cpus 2 and 3 are running significantly.

% top -u 12345 -1 -bn1
top - 14:07:59 up 3 days, 19:18,  2 users,  load average: 3.01, 3.13, 3.11
Tasks: 533 total,   9 running, 524 sleeping,   0 stopped,   0 zombie
%Cpu0  :  5.3 us, 10.5 sy,  0.0 ni, 84.2 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu1  :  5.3 us, 21.1 sy,  0.0 ni, 73.7 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu2  : 95.0 us,  0.0 sy,  0.0 ni,  0.0 id,  0.0 wa,  5.0 hi,  0.0 si,  0.0 st
%Cpu3  :100.0 us,  0.0 sy,  0.0 ni,  0.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st

% sql / as sysdba
SQL> select sid, sql_id, status from v$session where username='SYS';

   SID           SQL_ID    STATUS
______ ________________ _________
     6                  ACTIVE
    52 0gkvd80767mt0    ACTIVE
    53 0gkvd80767mt0    ACTIVE
    56 0gkvd80767mt0    ACTIVE
   237 0gkvd80767mt0    ACTIVE
   261 dcaztk5ucxwd9    ACTIVE
   287 0gkvd80767mt0    ACTIVE
   296 0gkvd80767mt0    ACTIVE
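We can verify the binding from the OS side as well; taskset from util-linux prints the affinity list of a process, here for the PMON background process (the pgrep pattern is an assumption, adjust it to your instance):

% taskset -cp $(pgrep -f ora_pmon)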

What happens when we disable the resource manager plan? (The first statement clears the plan in the running instance, the second one removes the entry from the spfile.)

ALTER SYSTEM SET resource_manager_plan='' SCOPE=MEMORY;
ALTER SYSTEM RESET resource_manager_plan;

After some time the load increases to more than 6, but is still confined to the last 2 cpu threads. The difference here is that it is no longer the database itself that makes the workload processes wait for cpu, but the operating system.

% top -u 12345 -1 -bn1
top - 15:17:30 up 3 days, 19:27,  1 user,  load average: 6.19, 4.36, 3.51
Tasks: 526 total,   8 running, 518 sleeping,   0 stopped,   0 zombie
%Cpu0  :  5.0 us,  5.0 sy,  0.0 ni, 85.0 id,  0.0 wa,  5.0 hi,  0.0 si,  0.0 st
%Cpu1  :  9.1 us, 22.7 sy,  0.0 ni, 63.6 id,  0.0 wa,  4.5 hi,  0.0 si,  0.0 st
%Cpu2  :100.0 us,  0.0 sy,  0.0 ni,  0.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu3  :100.0 us,  0.0 sy,  0.0 ni,  0.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st

So the database thinks it is running at full speed, but the scheduler of the operating system can handle only one process at a time on a logical cpu. The advantage of using a processor group here is that even if the database goes crazy with its processes, the operating system has cpu resources left to handle all other workload. Beware of the fact, though, that in times of cgroups the load average might not be that much of a reliable indicator anymore.
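Per-cpu statistics are the better lens here; with mpstat from the sysstat package you can watch just the pinned cpus (a sketch, assuming cpus 2 and 3 and a 5-second interval):

% mpstat -P 2,3 5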

Conclusion

As we have seen, instance caging and cpu pooling give us tools to create boundaries for the database processes. There’s no clear answer as to which of the methods you should use by default, as they can be combined and serve different purposes. One piece of advice that can be given is that you should always use one of the techniques to tame the database processes, since too many of them can produce an overload that brings your system to its knees. Especially in a RAC environment this can lead to a total outage of the server.

Appendix: Testing method

To create some cpu load I prepared a table on which I run a select with a cartesian product that takes ages. To simulate that usage for multiple users I start the SQL commands in the background of a Linux shell, so that we can saturate the resources that are bound to the database.
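To avoid typing the background call over and over, a small shell loop can start any number of sessions (a sketch based on the commands used above; adjust the count to your test):

% for i in $(seq 1 6); do sql / as sysdba @$PWD/runtest.sql > /dev/null & done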

The table is created as follows:

CREATE TABLE test (X NUMBER);
INSERT INTO  test VALUES (0);
INSERT INTO  test VALUES (1);
INSERT INTO  test VALUES (2);
INSERT INTO  test VALUES (3);
INSERT INTO  test VALUES (4);
INSERT INTO  test VALUES (5);
INSERT INTO  test VALUES (6);
INSERT INTO  test VALUES (7);
INSERT INTO  test VALUES (8);
INSERT INTO  test VALUES (9);
COMMIT;

The select statement that produces the steady cpu load looks like this (a cartesian product of twelve copies of the 10-row table, so 10^12 rows have to be counted):

SELECT COUNT(*) FROM test, test, test, test, test, test, test, test, test, test, test, test;
SELECT COUNT(*) FROM test, test, test, test, test, test, test, test, test, test, test, test;
SELECT COUNT(*) FROM test, test, test, test, test, test, test, test, test, test, test, test;
SELECT COUNT(*) FROM test, test, test, test, test, test, test, test, test, test, test, test;