I was recently engaged to design and implement a VMware vSphere architecture solution for a large company that wanted to virtualize Oracle RAC 11g R2 as the back end of a small number of web-based online transactional systems. Nothing unusual about that, I hear you say. Well, the interesting thing with this project is that the customer intended to use Oracle 11g R2 Standard Edition (11.2.0.3). Up until this point, all of the Oracle single instance and RAC systems I had been involved with virtualizing had used Enterprise Edition licenses. This was a relatively small project and a small part of a very large Oracle landscape, although it was still business critical. The customer was planning to use this project as the test case for virtualizing Oracle RAC 11g R2 with VMware vSphere, with the aim of potentially migrating a significant number of additional database systems off their existing standalone Intel, IBM pSeries and Sun SPARC platforms (which use, and will continue to use, Oracle Enterprise Edition).
This article won’t provide you with everything you need to successfully implement Oracle RAC 11g R2 Standard Edition on vSphere. But it will give you a good overview and some ideas and considerations that could feed into an Oracle RAC virtualization project. Each customer environment is different, and there is a lot of tuning required at the OS, hypervisor, storage and network layers to provide optimal performance for production tier 1 Oracle database environments (roughly on par with implementing Oracle RAC on bare metal). The best practices used during this design are covered in my article titled Deploying Enterprise Oracle Databases on vSphere.
Oracle 11g Standard Edition is great if you have a database that will fit within the product license limits. Currently it is limited to 4 CPU sockets, with no limits on core count, memory or database size. A comparison of the different editions is available on the Oracle website – Oracle Database 11g Editions. What Standard Edition lacks in comparison to Enterprise Edition are things like resource controls, which is one area where VMware vSphere can add a lot of value. Standard Edition does still offer RAC and Enterprise Manager, two very useful and valuable features if you need high availability and management visibility in your database system. By virtualizing with vSphere you also get a wealth of infrastructure metrics out of the box. In my opinion, Oracle 11g Standard Edition provides excellent value and features compared to most other database platforms, provided you have access to DBAs who can administer it. Be aware this is not a licensing discussion; you will still have to ensure the CPU sockets are correctly licensed in accordance with your licensing agreement.
Benefits and Requirements
My customer was very interested in receiving all the benefits I had outlined in my post on Oracle RAC in an HA/DRS Environment. They were particularly interested in resource controls, Guest OS isolation, rapid provisioning and non-disruptive upgrades. Performance was also very important, given the databases were supporting transactional web sites. Having VMware HA restart a RAC node in the case of a physical host failure or OS crash was a bonus, as they would be using the Oracle 11g OCI Client with TAF (Transparent Application Failover) in their applications to ensure that all database connections were transparently re-established on a surviving node.
I was tasked with designing a solution that would support OLTP and batch services on each of five production Oracle RAC clusters, plus five development databases, all within the limitations of the Oracle 11g Standard Edition license. Some of the additional requirements and constraints were as follows:
- Must use existing vCenter 4.1 U1 and vSphere 4.1 U1, as that is the current standard platform and there was insufficient time to upgrade to vSphere 5.0
- Only 1Gb/s network connections are available, and the solution must leverage existing network equipment
- Must be able to separate batch and online processing on different RAC nodes for each database
- Must be able to scale to handle future growth, within the limits of the Standard Edition License
- Must use existing storage platform, however two dedicated 8Gb/s array storage ports are available for the new database platforms
- Cost efficiency was very important for this project
- As these are production tier 1 databases there will not be any overcommitment initially
- An existing hardened RHEL 5.7 base template would be used as the foundation of each Oracle RAC Guest being deployed
Compute Solution Example
Even though the Oracle Standard Edition licensing restricted the design to 4 CPU sockets, I didn’t want to risk 50% of the cluster resources being lost due to a single host failure. As a result, single CPU socket hosts with 12 cores per socket (AMD) were specified, resulting in 4 hosts in total for the vSphere DRS/HA cluster. This was sufficient for the database workloads that would be supported by the infrastructure, as they were constrained more by memory than by CPU. This allowed the cluster to have N+1 hosts logically, with 2 hosts used for online and batch/reporting transactions, 1 host used for development, and 1 host (25%) reserved as failover/maintenance capacity.
As a result of specifying 4 hosts, the design could meet the memory requirements of the Oracle database VMs using 96GB per host and 8GB 1333MHz DIMMs, without the need to go to slower and more costly 16GB DIMMs. 96GB per host also aligned well with the customer’s vSphere Enterprise Plus licensing for the future, when they upgrade to vSphere 5.0 (96GB vRAM entitlement per CPU socket). This solution was more cost effective, even with the additional HBAs and network ports required, and provided a better risk profile for the customer compared to a two-host design.
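To make the sizing logic above concrete, here is a rough Python sketch of the N+1 capacity arithmetic. The host figures match the design described above; treating one host as failover headroom and the per-VM memory demand as the binding constraint is the assumption being illustrated, not output from the actual design documentation.

```python
# Rough sketch of the N+1 sizing arithmetic described above.
# Host figures match the design (1 socket, 12 cores, 96GB per host).

HOSTS = 4                 # single-socket hosts in the vSphere HA/DRS cluster
CORES_PER_HOST = 12       # AMD, 12 cores per socket
RAM_GB_PER_HOST = 96      # 12 x 8GB 1333MHz DIMMs

def usable_capacity(hosts, cores, ram_gb, failover_hosts=1):
    """Capacity left for VMs once N+1 failover headroom is reserved."""
    usable_hosts = hosts - failover_hosts
    return {
        "usable_hosts": usable_hosts,
        "usable_cores": usable_hosts * cores,
        "usable_ram_gb": usable_hosts * ram_gb,
        "failover_reserve_pct": round(100.0 * failover_hosts / hosts, 1),
    }

if __name__ == "__main__":
    print(usable_capacity(HOSTS, CORES_PER_HOST, RAM_GB_PER_HOST))
    # With no memory overcommitment, the sum of configured VM memory
    # (e.g. 16GB per RAC node) must stay within usable_ram_gb.
```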
DRS is configured to be fully automated but at the most conservative setting. This allowed automatic placement, and vMotion migrations for maintenance mode only. This was driven by the requirement not to over-commit resources and by the time a vMotion migration of a large VM took over a single 1Gb/s network (around 2 minutes). Each RAC cluster has a DRS anti-affinity rule specified to keep its nodes apart on different physical vSphere hosts.
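If you later want to script the anti-affinity rules rather than create them in the vSphere Client, the sketch below shows one way to do it with the pyVmomi library. This is a hedged illustration only, not part of the original build: the vCenter address, credentials, cluster and VM names are placeholders, and the exact classes and connection options should be verified against your pyVmomi version.

```python
# Sketch only: create a DRS anti-affinity rule keeping two RAC node VMs
# on separate hosts. All names and credentials are hypothetical placeholders.
import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

si = SmartConnect(host="vcenter.example.com", user="administrator",
                  pwd="secret", sslContext=ssl._create_unverified_context())
content = si.RetrieveContent()

def find_by_name(vimtype, name):
    """Locate an inventory object of the given type by name."""
    view = content.viewManager.CreateContainerView(content.rootFolder, [vimtype], True)
    try:
        return next(obj for obj in view.view if obj.name == name)
    finally:
        view.DestroyView()

cluster = find_by_name(vim.ClusterComputeResource, "oracle-rac-cluster")
rac_nodes = [find_by_name(vim.VirtualMachine, n) for n in ("rac1-node1", "rac1-node2")]

rule = vim.cluster.AntiAffinityRuleSpec(name="rac1-keep-apart", enabled=True, vm=rac_nodes)
spec = vim.cluster.ConfigSpecEx(rulesSpec=[vim.cluster.RuleSpec(operation="add", info=rule)])
task = cluster.ReconfigureComputeResource_Task(spec, modify=True)
# Wait for the task with your preferred helper, then Disconnect(si).
```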
Network Solution Example
Each Oracle RAC guest would have 3 vNICs (VMXNET3) for the Public, Private and Backup networks. The Private network was used for the RAC cluster interconnect and had Jumbo Frames enabled. Separate vSwitches, each with redundant physical NICs, were specified for the three main traffic types: Management/vMotion, Public RAC Network, and Private RAC Network. Only a single RAC interconnect network was used per cluster, although Oracle RAC 11g R2 supports up to 4. Additional interconnect networks can be added at a later stage for additional redundancy or throughput. However, in most cases the redundancy provided by the vSwitch is sufficient.
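During implementation it is worth confirming that jumbo frames actually pass end-to-end on the private interconnect path (vNIC, vSwitch, physical switch) before Grid Infrastructure is installed. The small sketch below simply wraps the Linux ping command with the don't-fragment flag; the peer interconnect IP is a hypothetical placeholder.

```python
# Sketch: verify jumbo frames end-to-end on the RAC private interconnect.
# 8972 bytes of ICMP payload + 28 bytes of IP/ICMP headers = 9000-byte packet.
# The peer address below is a hypothetical interconnect IP.
import subprocess

def jumbo_frames_ok(peer_ip, payload=8972):
    """Return True if a 9000-byte packet traverses the path unfragmented."""
    result = subprocess.run(
        ["ping", "-c", "3", "-M", "do", "-s", str(payload), peer_ip],
        capture_output=True, text=True)
    return result.returncode == 0

if __name__ == "__main__":
    if jumbo_frames_ok("192.168.100.12"):
        print("Jumbo frames OK on the interconnect path")
    else:
        print("MTU problem somewhere on the interconnect path")
```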
If you plan to use VMware Distributed Switches then you should consider the benefits of Network I/O Control and Load Based NIC Teaming. However, if you don’t have vSphere Enterprise Plus licenses, or are planning a vCenter migration at some point, Standard vSwitches (as used in this example project) will work perfectly well.
Storage Solution Example
Each Oracle RAC guest VM was configured with the maximum of 4 SCSI adapters (1 x LSI Logic, 3 x PVSCSI). OS, App and RMAN virtual disks and mount points were configured on the first SCSI adapter, with the CRS/Voting, DB Data, Redo and Archive Log virtual disks configured on the remaining 3 SCSI adapters. A separate LUN and datastore was configured per SCSI adapter. Each node has its own OS/App datastore, with the other datastores shared using the vSphere multi-writer functionality (KB 1034165). In total each RAC cluster had 7 datastores. Each virtual disk was configured to support clustering features, i.e. eager zeroed thick, while the back end datastores were thin provisioned on the SAN. Each physical host had a single dual port HBA, and round robin multi-pathing was used for each LUN. All LUNs are presented to all hosts within the cluster. Dedicated array-side storage ports were configured for the Oracle RAC vSphere hosts. The LUN queue depths were modified to ensure optimal performance. The Oracle RAC guests are configured to use Oracle ASM for all shared disks.
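To make the disk layout easier to picture, the sketch below expresses a per-node controller-to-disk mapping and the multi-writer advanced settings as plain Python data. The exact distribution of shared disks across the three PVSCSI controllers is illustrative rather than taken from the actual build, and the `scsiX:Y.sharing = "multi-writer"` parameter reflects my reading of KB 1034165 for vSphere 4.x; check the KB for your version.

```python
# Sketch: illustrative per-RAC-node virtual disk layout and the VMX advanced
# settings that enable multi-writer sharing (per KB 1034165, vSphere 4.x era).

disk_layout = {
    "scsi0 (LSI Logic)": ["OS", "App", "RMAN"],          # node-local datastore
    "scsi1 (PVSCSI)":    ["CRS/Voting"],                 # shared datastore
    "scsi2 (PVSCSI)":    ["DB Data", "Redo"],            # shared datastore
    "scsi3 (PVSCSI)":    ["Archive Logs"],               # shared datastore
}

def multiwriter_settings(shared_disks):
    """Generate the VMX advanced config entries for the shared, eager zeroed
    thick disks, e.g. {'scsi1:0.sharing': 'multi-writer', ...}."""
    return {f"{bus}:{unit}.sharing": "multi-writer" for bus, unit in shared_disks}

# Example: the shared disks on the three PVSCSI controllers.
print(multiwriter_settings([("scsi1", 0), ("scsi2", 0), ("scsi2", 1), ("scsi3", 0)]))
```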
VM, Guest OS and RAC Database Solution Example
I configured each VM with between 1 and 3 vCPUs, depending on workload requirements, and 16GB RAM. Extensive kernel and OS tuning is implemented to ensure optimal performance; this is over and above the Oracle-specified minimum settings. Each DB is configured with an 8GB SGA and 4GB PGA, with the SGA backed by Huge Pages in the guest OS. Sufficient memory is reserved per VM to cover the SGA and guarantee the minimum acceptable level of performance. As Automatic Memory Management (AMM) is not compatible with Huge Pages, it was not used. Huge Pages provides multiple benefits, including increased performance and no risk that the OS will swap out the huge memory pages. Sufficient swap space per guest OS is configured to cover the amount of memory that is unreserved per VM.
OS swapping should only ever be observed if there is a host failure in the vSphere cluster or a process failure within the guest. Guests should be sized and configured so they do not swap memory during normal operations. Allocating more memory over and above the reserved amount (the minimum necessary for acceptable performance) can reduce IOPS on the SAN and provide a performance bonus during normal operations, but this memory may be sacrificed during failure scenarios.
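Here is a hedged sketch of the memory arithmetic described above: sizing vm.nr_hugepages for the SGA, the vSphere memory reservation, and the guest swap space. The 2MB huge page size is the RHEL 5 x86_64 default, the other figures follow the example configuration, and the small hugepages headroom is an illustrative assumption (Oracle's hugepages_settings.sh script calculates the exact figure from a running instance).

```python
# Sketch of the guest memory arithmetic described above.
# Figures follow the example design; adjust for your own workloads.

HUGEPAGE_MB = 2            # RHEL 5 x86_64 default huge page size
VM_MEMORY_GB = 16          # configured VM memory
SGA_GB = 8                 # SGA backed by huge pages
PGA_GB = 4                 # PGA (not huge-page backed)

# vm.nr_hugepages: enough 2MB pages to back the whole SGA, plus a small
# illustrative headroom for per-segment overhead.
nr_hugepages = (SGA_GB * 1024) // HUGEPAGE_MB + 8

# Reserve at least the SGA at the vSphere layer so the minimum acceptable
# level of performance is guaranteed (some designs reserve SGA + PGA).
reservation_gb = SGA_GB

# Guest swap sized to cover whatever memory is left unreserved per VM.
swap_gb = VM_MEMORY_GB - reservation_gb

print(f"vm.nr_hugepages = {nr_hugepages}")
print(f"VM memory reservation = {reservation_gb} GB")
print(f"Guest swap space >= {swap_gb} GB")
```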
Each RAC database was configured with the standard 3 SCAN (Single Client Access Name) listeners, and the SCAN name and IPs were all configured in DNS to provide DNS round robin load balancing. GNS (Grid Naming Service) was not used, as DHCP was not available on the networks where RAC was being configured.
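During setup it is worth confirming that the SCAN name really does resolve to all three addresses and that DNS rotates the order of the answers. A minimal sketch using only the Python standard library follows; the SCAN name is a hypothetical placeholder, and be aware that local resolver caching can mask the rotation.

```python
# Sketch: confirm the SCAN name resolves to all three listener IPs in DNS.
# The SCAN name below is a hypothetical placeholder.
import socket
from collections import Counter

def scan_addresses(scan_name, lookups=30):
    """Resolve the SCAN name repeatedly and count how often each address
    comes back first (DNS round robin should rotate the order)."""
    first_seen = Counter()
    all_addrs = set()
    for _ in range(lookups):
        addrs = [info[4][0] for info in
                 socket.getaddrinfo(scan_name, 1521, proto=socket.IPPROTO_TCP)]
        all_addrs.update(addrs)
        first_seen[addrs[0]] += 1
    return all_addrs, first_seen

addrs, rotation = scan_addresses("rac1-scan.example.com")
print(f"SCAN resolves to {len(addrs)} addresses: {sorted(addrs)}")
print(f"Distribution of first answer: {dict(rotation)}")
```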
We did not have time to implement much in the way of automation to build the Oracle RAC clusters, so the provisioning process was largely manual. Even so, a single two-node RAC cluster could be provisioned from scratch, including storage, VM, OS, RAC and DB, and be functional and ready to test in approximately 2 hours. With additional automation this could have been achieved much more quickly.
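As an example of what some of that additional automation could look like (this was not built for the project), below is a hedged pyVmomi sketch that clones a RAC node VM from the hardened RHEL template. The template, cluster, datastore and VM names are placeholders, and the storage, RAC and DB layers described above would still need their own automation.

```python
# Sketch only: clone a RAC node VM from the hardened RHEL 5.7 template.
# All names are hypothetical placeholders; assumes an existing pyVmomi
# ServiceInstance 'si' (see the earlier connection sketch).
from pyVmomi import vim

def clone_rac_node(si, template_name, vm_name, cluster_name, datastore_name):
    content = si.RetrieveContent()

    def find(vimtype, name):
        view = content.viewManager.CreateContainerView(content.rootFolder, [vimtype], True)
        try:
            return next(o for o in view.view if o.name == name)
        finally:
            view.DestroyView()

    template = find(vim.VirtualMachine, template_name)
    cluster = find(vim.ClusterComputeResource, cluster_name)
    datastore = find(vim.Datastore, datastore_name)

    relocate = vim.vm.RelocateSpec(pool=cluster.resourcePool, datastore=datastore)
    clone_spec = vim.vm.CloneSpec(location=relocate, powerOn=False, template=False)
    folder = template.parent  # deploy alongside the template for simplicity
    return template.CloneVM_Task(folder=folder, name=vm_name, spec=clone_spec)

# task = clone_rac_node(si, "rhel57-hardened-template", "rac1-node1",
#                       "oracle-rac-cluster", "rac1-node1-os")
```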
Validation Testing and Results
As part of the implementation process you should execute validation tests to ensure that the results are as expected, before any of the real applications are loaded into the environments. For this purpose I used Swingbench v2.3 for stress testing, and to provide load during failure scenario testing. Although this is a synthetic workload and not a valid comparison to the real applications that would be supported by these systems, it provided a way to test the RAC and vSphere functionality that was important to the customer. Each of the test cases below was run with both a standard connection string (//scan-name/service-name) and a TNS connection string, using both the JDBC OCI and Thin drivers.
- Baseline VM IO test (using Iometer, single VMDK, Windows VM), used to establish guidelines for latency and IOPS as a comparison for later tests.
- Baseline DB stress test (to get a result to compare further results against).
- Stress test during a RAC node vMotion migration; the vMotion migration took 2 minutes with no connection drops.
- Graceful shutdown of a RAC node during stress test: services and listeners failed over to the surviving node; no connection drops using the TNS connection string (TAF) as all connections failed over to the surviving node; expected connection drops using the standard connection string (no TAF), but surviving connections were not impacted.
- Power off of a RAC node during stress test: same behaviour as the graceful shutdown case, with services and listeners failing over to the surviving node, no connection drops with TAF, expected connection drops without TAF, and surviving connections not impacted.
- Power off of the host containing a RAC node during stress test: services and listeners failed over to the surviving node; no connection drops using the TNS connection string (TAF); expected connection drops using the standard connection string (no TAF), but surviving connections were not impacted. The impacted RAC node was restarted in approximately 2 minutes on the surviving physical vSphere host. When the powered-off physical vSphere host was rebooted, the RAC node was automatically migrated back to it using vMotion (thanks to the DRS anti-affinity rule) to restore cluster redundancy.
During a RAC node failure, even when leveraging TAF, there is a brief pause in transactions while the surviving nodes execute the failover of listeners and connections. In my experience this is generally around 15 to 20 seconds. Selects will resume where they left off, but any uncommitted in-flight updates, inserts or deletes will be rolled back and must be resubmitted from the start (refer to the Oracle documentation for more information on TAF functionality and limitations). This is the same for Oracle RAC when deployed on a bare metal OS.
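To make the two connection styles used in the tests concrete, here is a hedged sketch of the plain EZConnect-style string versus a TNS descriptor with a basic TAF failover clause. The project itself tested with the JDBC OCI and Thin drivers; cx_Oracle is used here purely for illustration (it also goes through the OCI layer), and the host, service, credentials and TAF parameters are placeholders to adjust for your environment.

```python
# Sketch: the two connection styles used in the failover tests.
# Host, service and credentials are hypothetical placeholders.
import cx_Oracle

SCAN = "rac1-scan.example.com"
SERVICE = "oltp_svc"

# 1. Standard EZConnect-style string: no TAF, sessions on a failed node drop.
ezconnect = f"{SCAN}:1521/{SERVICE}"

# 2. TNS descriptor with TAF: in-flight SELECTs resume on a surviving node.
taf_descriptor = (
    "(DESCRIPTION="
    f"(ADDRESS=(PROTOCOL=TCP)(HOST={SCAN})(PORT=1521))"
    "(CONNECT_DATA="
    f"(SERVICE_NAME={SERVICE})"
    "(FAILOVER_MODE=(TYPE=SELECT)(METHOD=BASIC)(RETRIES=20)(DELAY=5))))"
)

conn = cx_Oracle.connect(user="soe", password="secret", dsn=taf_descriptor)
cur = conn.cursor()
cur.execute("SELECT instance_name FROM v$instance")
print("Connected to instance:", cur.fetchone()[0])
```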
When the customer implements vSphere 5 they will have the option of using multiple VMkernel ports for vMotion to get more than 1Gb/s of throughput. With large amounts of memory allocated per VM, a 10GbE network, or multiple vMotion ports in vSphere 5, can greatly improve the performance of migrations, especially when Jumbo Frames are used.
Conclusion
Oracle RAC 11g R2 Standard Edition offers great value for companies that need the availability and features it provides, have workloads that fit within the license limits, and license the infrastructure where the Oracle software is installed and/or run. Implementing it on a 4-node VMware vSphere HA/DRS cluster with a single CPU socket per host is a great balance of risk, availability and performance, especially for memory or IO constrained workloads, and it is a very good place to start with Oracle RAC. When combined with VMware vSphere you can achieve even higher levels of availability, rapid provisioning, resource isolation, non-disruptive hardware upgrades and maintenance, hardware independence for DR, additional infrastructure metric insight, and great performance.
Oracle RAC runs exceptionally well on VMware vSphere and, since 11g R2 v11.2.0.2 (Nov 2010), has been fully supported by both Oracle and VMware. By virtualizing Oracle databases and applications on VMware vSphere you can accelerate and increase your ROI while providing predictable performance and service levels. Don’t hesitate to reach out to me if your organization is considering virtualizing Oracle databases or applications on VMware vSphere and would like some help and experience. There is a contact form on the author page.
For additional information on virtualizing Oracle visit my Oracle Page.
—
This post first appeared on the Long White Virtual Clouds blog at longwhiteclouds.com. By Michael Webster. Copyright © 2012 – IT Solutions 2000 Ltd and Michael Webster. All rights reserved. Not to be reproduced for commercial purposes without written permission.
Excellent design. I've long been an advocate of using Oracle Standard Edition when possible since it comes with RAC included, and your maximizing of the license limits with 12 cores per CPU was an excellent choice.
I just wish Oracle would change some of their products to allow use of Standard Edition instead of Enterprise Edition – for example, Oracle E-Business Suite requires Enterprise Edition. Bleh.
Question regarding setup just out of curiosity – I know in the past you advocated using elevator=deadline vs the standard CFQ elevator – did you do any testing regarding the performance with different elevators in this environment?
Also, I see you didn't use AMM because of its incompatibility with huge pages. Did you utilize ASMM (Automatic Shared Memory Management, introduced with 10g) to balance the memory between the various SGA pools/components, and if not, why?
Like I said before, it sounds like an excellent design for leveraging Oracle Standard Edition.
Hi Jay, re the elevator, we use NOOP, as per my article on virtualizing Enterprise Oracle Databases on vSphere. I've done quite a lot of testing in virtualized environments with enterprise scale databases and this elevator provides the best performance, due to the multiple layers of schedulers and queues that are involved in IO. In my experience, getting the IOs to the driver and HBA as fast as possible (with the least overhead) offers the best chance of the lowest latency and highest throughput. With regard to ASMM, yes, I did specify and use that. Within the defined SGA/PGA sizes we allocated, Oracle knows best how to use that memory and adjust the different pools when needed. I did quite a few comparison tests running Swingbench 2.3 and 2.4 between using AMM and not using AMM. I found the performance and stability superior with huge pages (no risk of host swapping) and without AMM. But in all cases ASMM was always enabled.
Maybe it has been some time since you wrote the article, but it seems that Oracle decided to change the rules (again). Just check the last paragraph below.
My question is: even with the changed licensing rules, is the rule of 4 sockets still in place? I mean: can SE be used on a cluster with 4 Intel sockets (8 cores/socket) while paying the SE licenses for 16 processors?
"Processor: This metric is used in environments where users cannot be identified and counted. The Internet is a typical environment where it is often difficult to count users. This metric can also be used when the Named User Plus population is very high and it is more cost effective for the customer to license the Database using the Processor metric. The Processor metric is not offered for Personal Edition. The number of required licenses shall be determined by multiplying the total number of cores of the processor by a core processor licensing factor specified on the Oracle Processor Core Factor Table which can be accessed at http://oracle.com/contracts.
All cores on all multicore chips for each licensed program are to be aggregated before multiplying by the appropriate core processor licensing factor and all fractions of a number are to be rounded up to the next whole number. When licensing Oracle programs with Standard Edition One or Standard Edition in the product name, a processor is counted equivalent to a socket; however, in the case of multi-chip modules, each chip in the multi-chip module is counted as one occupied socket."
Hi Gustavo, the answer is yes. I did write this a while ago, and it still applies. In fact I got an email from Oracle just yesterday on this exact topic confirming it still applies. An Intel multicore socket is not the same thing as a multi-chip module. In this case a core is not a processor for the purposes of SE licensing. The multi-chip module is basically a processor board in a mid-range system or mainframe class server where you have multiple processors per board. The wording is quite open to interpretation and isn't exactly clear, so this is the understanding I've had, and it has been confirmed by various interactions with Oracle to date, and by customers' interactions and agreements as well.
But one thing you have hit on is that Oracle can and has changed licensing rules over time, and you should always check against the most recently agreed license agreement contract that is signed and executed by both parties (you and Oracle) and legally binding. Anything other than this fully executed contract is really just hearsay. The customer that this article is based on is still quite happily running many RAC nodes and clusters in their 4-node vSphere HA cluster, with each host having one socket.
Do you mind sharing Oracle's email? I'm sure that this was an answer to a direct question, but I'd like to understand their positioning and the specific question. I think your customer is a great case!
This subject seems like an old discussion about chip design. IBM POWER up to POWER5 was an MCM chip, and a few others were too. This means that they have more than one chip on the same "package". The same applies to some older Intel processors.
Since most of the newer chips do not follow this design, the rule still applies.