8 Responses

  1. nate

    Have you seen TPS's effectiveness on Linux? In my experience it's basically worthless. I don't disable it because it doesn't really matter much (and I'm not CPU bound on any workload), but I typically see sub-5% savings with TPS on Linux, at least on vSphere 4. That's been consistent across clusters, hardware, Linux versions, and VMware versions (and companies as well). It's a much different case than with Windows.

    Even with two VMs running the same OS and the same applications, taking the same traffic from a load balancer, the amount of memory that TPS reclaims is startlingly tiny with Linux.

    Looking at one of my servers: it has 192GB of memory and 17 VMs on it, with 99GB of host memory in use. 15 of the 17 VMs have the same OS/version, and all are Linux (well, except one; I see my vCenter is on there too with win2k8). TPS savings on this box are 4,806 MB.

    – 4 VMs run the same app "app 1"

    – 3 VMs run the same app "app 2"

    – 2 VMs run the same app "app 3"

    – Rest of VMs on the box run other various apps.

    – The largest consumer of memory is a MySQL DB which has 32GB of memory

    One would think that with the common OS, and with VMs running the same apps, the amount saved would be higher, but for whatever reason it's not. This trend goes back at least to 2009, if not earlier (my memory is foggy on this topic prior to that).
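As a rough check on the figures quoted above, the savings can be expressed as a share of host memory in use. This is only a sketch: it assumes "GB" means 1024 MB and that the savings are measured against memory in use rather than total installed memory. The function name is illustrative.

```python
# Figures quoted in the comment above.
HOST_IN_USE_GB = 99    # host memory in use
TPS_SAVED_MB = 4806    # memory reclaimed by TPS

def tps_savings_pct(saved_mb, in_use_gb):
    """Return TPS savings as a percentage of host memory in use.

    Assumes 1 GB = 1024 MB (an assumption, not stated in the comment).
    """
    return 100.0 * saved_mb / (in_use_gb * 1024)

pct = tps_savings_pct(TPS_SAVED_MB, HOST_IN_USE_GB)
print(f"TPS savings: {pct:.2f}% of host memory in use")
```

Under those assumptions the result lands a little under 5%, in line with the "sub 5%" figure mentioned earlier.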

    I don't swap on any of my systems. The last thing I want is for VMs to be actively swapping, taking up valuable I/O on my storage. I give my VMs a standard 512MB of swap in case of emergency, but if they have to swap more than that I'd rather have them fail and crash than thrash my storage.

    I used ESX 3.5 and Oracle a bunch at one company back in 2007. It worked very well, and for a while allowed us to slash our licensing by going "unsupported". 2007 was, I believe, when the first Intel quad core processors were released, so we had deployed a few DL380G5s with single socket quad cores. Combining that with Oracle SE, we split our ESX licenses (which came in packs of 2 sockets at the time) between systems. VMware didn't "support" single socket systems (I don't think I ever had to reach out to VMware support), and Oracle didn't support running on VMware (again, no tickets filed with Oracle either). It obviously dramatically increased our flexibility and slashed our Oracle licensing vs our previous model of each database instance running on separate physical hardware (typically a dual proc single core system).

    I haven't been at a company since that uses Oracle, though I do miss it. MySQL just doesn't come close in so many areas, and operationally MySQL is much more complicated to support and troubleshoot.

    For production we stuck with bare metal, mainly because there was no compelling case for virtualizing that outweighed the lack of support we might get from Oracle and the ESX licensing fees we would incur on those boxes.

    1. @vcdxnz001

      Hi Nate, firstly thanks for the comment, it's great to have this discussion. I am aware of the effectiveness of TPS in general, and especially on Linux and Windows in different environments, with different hardware and different versions of ESX/ESXi. The effectiveness of TPS really isn't the point of the article. The point is that there is a big danger if you disable it, which can be disastrous. The dangers of disabling TPS should not be downplayed, as it can have a big impact not just on the VM or host where it has been implemented but also on the rest of the environment. This is because it eliminates TPS as a possibility, and this may cause unnecessary host swapping in the case of memory contention, which will in turn cause a big impact on storage, as you've pointed out. We don't want any swapping at any level (again, as you've pointed out), but disabling TPS eliminates one of the options that may prevent some host swapping in situations of host memory contention.

      vSphere 4, when combined with EPT enabled hardware, will back VMs with large pages by default and will not break those pages until there is resource contention. Until then, this gives the appearance of higher host memory utilization and lower TPS savings. NUMA also has implications for TPS, but the hypervisor is smart enough to deal with it (vSphere 4+). vSphere 5 has a number of enhancements in the area of memory management, and TPS in particular, that you will notice once you start deploying it.

      For production applications, and Oracle DBs in particular, there should be very little to no benefit from TPS during normal operations, because I try to right size all the VMs for optimal performance and expect the VMs to use all of the memory and resources assigned. In addition I'd normally be using huge pages in the OS and tuning the kernel for optimal performance. I will most likely be providing guaranteed service levels and resource assignments to these VMs by way of a minimum level memory reservation and, where required, 100% memory reservations (for Tier 1 apps). But there are still situations, such as host failures and other unintended events, that may require various memory management techniques to kick in, and you want to have TPS available as an option. Note that the resource guarantees apply generally to production and Tier 1 environments and would not generally be used in non-production environments, where higher overcommitment levels are desired and acceptable.

      I designed and deployed a large financial system on ESX 3.5 that required 800 Linux VMs. It wasn't entirely smooth sailing, for a number of reasons, and we had to do a lot of testing and tuning to get the best possible performance, but it was incredibly successful and none of the tuning required disabling TPS. My goal wasn't to use TPS; my goal was to get the best availability and performance possible that met the business requirements, which I achieved. This was a very demanding environment dealing with billions of dollars, and very leading edge at the time. I used TPS as a bonus only for the regions of memory that we knew we could sacrifice, and we sized everything accordingly.

      Oracle DB and Applications, including Oracle RAC, are fully supported on vSphere (see my other articles below for details), and there is no licensing penalty if you design your environment optimally. In many cases there are license savings. The benefits of virtualizing production systems on vSphere are very compelling, and aren't just limited to hardware savings or efficiencies. They include significant opex savings (including admin time), SDLC and testing lifecycle savings, DR savings and simplified DR, reduced complexity, and greatly increased efficiency. Hardware independence is a big benefit even if you don't leverage SRM, as you can restore your VMs anywhere that supports the hypervisor. Testing savings alone can be extremely significant to big projects.

      You may like to read these other Oracle on vSphere related articles that I've posted on this blog:
      http://longwhiteclouds.com/2011/11/22/deploying-e
      http://longwhiteclouds.com/2012/01/22/oracle-rac-
      http://longwhiteclouds.com/2011/08/28/oracle-rac-

  2. nate

    Yeah, that is true: disabling TPS takes away one thing that VMware can use to prevent host/guest swapping. I guess I was just saying that, at least in my case, it really wouldn't have much of a noticeable effect.

    Something I'd love to see, and something I've written about on more than one occasion, at least with regards to Linux, is for the memory balloon driver to specifically target memory that is used for buffer cache. By default Linux uses all the memory it can for the cache. Frequently this cache isn't even needed, but the guest OS isn't aware that if it freed the memory, other guests could benefit.
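To get a feel for how much guest memory is sitting in cache, the `Buffers` and `Cached` fields of `/proc/meminfo` can be summed. The sketch below parses a made-up sample rather than a live system, so the numbers are purely illustrative. Note also that `Cached` includes some memory that is not freely reclaimable (shared memory, for example), so this is an upper-bound estimate at best.

```python
# Hypothetical excerpt in /proc/meminfo format; values are invented.
SAMPLE_MEMINFO = """\
MemTotal:        8167848 kB
MemFree:          512340 kB
Buffers:          204800 kB
Cached:          4096000 kB
"""

def parse_meminfo(text):
    """Parse 'Name:   value kB' lines into a {name: value_in_kB} dict."""
    fields = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        parts = rest.split()
        if parts:
            fields[name.strip()] = int(parts[0])
    return fields

def page_cache_kb(fields):
    """Rough estimate of cache memory: Buffers + Cached, in kB."""
    return fields.get("Buffers", 0) + fields.get("Cached", 0)

fields = parse_meminfo(SAMPLE_MEMINFO)
cache_kb = page_cache_kb(fields)
cache_share = 100.0 * cache_kb / fields["MemTotal"]
print(f"cache: {cache_kb} kB ({cache_share:.1f}% of MemTotal)")
```

On a real guest, the same functions could be fed the contents of `/proc/meminfo` read from disk.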

    This doesn't apply much to Oracle systems, since Oracle tends to be configured with direct I/O and a large enough memory footprint that there isn't much left over for anything else.

    So Oracle is fully supported on VMware now? I haven't looked in a while. Last time I saw, they said they would try to help but might insist the customer replicate the issue on bare metal. This was a few years ago though.

    Looking here, it seems VMware provides support for Oracle internally, so rather than going to Oracle direct, customers can go to VMware:
    http://www.vmware.com/support/policies/oracle-sup

    Good post though!

    1. @vcdxnz001

      Thanks Nate. You are quite correct: VMware provides a single point of contact for all Oracle support incidents and will coordinate with Oracle to resolve customers' issues on their behalf through its special support relationship.

      There is a way to get the balloon driver to target just the filesystem buffer cache, and that is by using huge pages within the guest. The OS can't swap that memory, because it is locked by the process that is using it. This assumes that your software supports huge pages. It is an effective mechanism that I have used in the past where I wanted the filesystem cache to improve performance, but knew that it would be the first and hopefully only target of ballooning. I then protected the memory backed by huge pages by setting a reservation on the VM to cover the core Java/Oracle processes, plus a little room for standard OS processes. This worked well.
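The sizing arithmetic behind that approach might look like the sketch below. It is not taken from the reply: the 2 MiB huge page size (a common x86-64 default, visible as `Hugepagesize` in `/proc/meminfo`), the 8 GiB process footprint, and the 1 GiB OS headroom are all assumptions chosen for illustration. The page count is the kind of value one would give to the Linux `vm.nr_hugepages` sysctl.

```python
import math

HUGE_PAGE_MIB = 2  # assumed huge page size; verify on the actual guest

def hugepages_needed(footprint_mib, page_mib=HUGE_PAGE_MIB):
    """Number of huge pages needed to cover a locked memory footprint."""
    return math.ceil(footprint_mib / page_mib)

def reservation_mib(footprint_mib, os_headroom_mib, page_mib=HUGE_PAGE_MIB):
    """VM memory reservation: huge page memory plus room for OS processes."""
    return hugepages_needed(footprint_mib, page_mib) * page_mib + os_headroom_mib

# Hypothetical figures, for illustration only.
footprint_mib = 8 * 1024   # core Java/Oracle process memory
os_headroom_mib = 1024     # room for standard OS processes

pages = hugepages_needed(footprint_mib)
reserve = reservation_mib(footprint_mib, os_headroom_mib)
print(f"vm.nr_hugepages = {pages}")
print(f"VM memory reservation = {reserve} MiB")
```

Anything configured above the reservation would then be left as filesystem cache, which is the memory ballooning is expected to reclaim first.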

  3. So Say SMEs in Virtualization and Cloud: Episode 24 Transparent Page Sharing (TPS) | ServerGround.net

    […] Michael Webster's TPS post. […]

  4. Fight the FUD – Oracle Licensing and Support on VMware vSphere « Long White Virtual Clouds

    […] For guidance and ideas for design and architecture for your Oracle Databases on vSphere here are two great articles. Deploying Enterprise Oracle Databases on vSphere, Blueprint for Successful Large Scale Oracle Virtualization on vSphere. […]

  5. Fact or Fiction: Does TPS and/or Mem Reservation Help/Hurt Oracle DB Performance | VMware vSphere Blog - VMware Blogs

    […] of TPS on Oracle DB virtual machines has been discussed in many places (reference: here, here and here). Yet I still hear from customers and partners, most recently at VMworld in San Francisco, that […]

  6. » Virtualized Oracle Databases on UCS Long White Virtual Clouds

    […] time ago I wrote an article about EMC’s Blueprint for Successful Large Scale Oracle Virtualization on vSphere. Now Cisco IT has published a similar whitepaper and study after having virtualized a large number […]
