One of my colleagues ran into trouble connecting a vCenter Virtual Appliance to Active Directory for authentication. He was getting a strange error saying that the FQDN (Fully Qualified Domain Name) was wrong. The actual error message was as follows:
Failed to execute '/usr/sbin/vpxd_servicecfg 'ad' 'write' 'administrator@<domainfqdn>' CENSORED '<domainfqdn>'
VC_CFG_RESULT=302(Error: Enabling Active Directory failed.)
After quite a bit of research, we found the solution.
There are a lot of environments running MySQL and PostgreSQL to support their systems. My team at Nutanix and I have been getting a lot of enquiries about how to set up these databases for best performance, and customers have also been using them to benchmark and baseline different systems. One of the challenges with these databases is that they give you only limited control over where data files and transaction logs can be placed, which makes increasing IO parallelism a bit of a challenge. Your database is just an extension of your storage, and all storage devices, even virtual ones, have a limited queue depth to work with. Unlike Oracle, SQL Server, Sybase, DB2, etc., you can't just create a whole bunch of mount points and spread your data files over them (which increases available queue depth and potential IO parallelism). But the solution to this problem is quite simple with Linux LVM (Logical Volume Manager). I'll take you through some of the steps I took to set up a test VM for MySQL testing with HammerDB and PostgreSQL testing with PGBench.
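As a rough illustration of the LVM approach, the sketch below stripes a logical volume across several virtual disks so that IO fans out over multiple device queues. The device names, volume group name, sizes and mount point are all assumptions for a lab VM, not the exact steps from this article.

```shell
# Sketch only: /dev/sdb..sde, the "dbvg" volume group, sizes and the
# mount point are assumed lab values. Run as root; adjust to suit.

# Present four small virtual disks to the VM, then combine them into
# one volume group so IO can fan out across four device queues.
pvcreate /dev/sdb /dev/sdc /dev/sdd /dev/sde
vgcreate dbvg /dev/sdb /dev/sdc /dev/sdd /dev/sde

# Stripe the logical volume across all four disks (-i 4) with a 1MB
# stripe size (-I 1024, in KB) so parallel IO hits every queue.
lvcreate -n pgdata -i 4 -I 1024 -L 100G dbvg

# Filesystem and mount point for the PostgreSQL data directory.
mkfs.xfs /dev/dbvg/pgdata
mkdir -p /var/lib/pgsql/data
mount /dev/dbvg/pgdata /var/lib/pgsql/data
```

A similar striped volume can be created for the transaction logs on a separate set of virtual disks, keeping log writes on their own device queues.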
I was helping a customer troubleshoot a performance problem with their Oracle 11g R2 database today and we noticed that it was taking between 90 and 120 seconds to complete a checkpoint. Anyone who knows anything about Oracle knows that this is an eternity for a database to wait while switching online redo log files and continuing to process transactions. This was causing a lot of dissatisfaction with performance for the customer, which is understandable. But it didn't take long to figure out what the problem was and fix it. Literally changing a single parameter in the database and restarting the instance turbocharged performance by over 10x.
There is a lot of FUD about data corruption, torn IO, write ordering and other aspects of using NFS as a datastore in VMware vSphere, even when the VMs are configured to use virtual disks. This is surprising, especially given that some very large VMware vSphere based clouds are built on NFS storage presented as datastores for use with VMs, and that for years numerous companies have been running business critical apps on NFS, presented as datastores or otherwise. Many of you may not know that VMware has actually patented the process of presenting NFS as a datastore to VMs that use virtual SCSI disks (US7865663), so that it emulates the SCSI protocol. You also may not know that not all storage systems, even when using block based storage such as FC, FCoE or iSCSI, honour all of the techniques needed to keep your data safe. A lot of it comes down to the individual storage system implementation. Enterprise storage systems that take data protection seriously and implement the appropriate IO protections are all suitable for running business critical apps, even when presenting NFS for use as a datastore to VMware vSphere. So what do you need to know?
Cumulus Linux is the answer for companies that want to run software defined networking on a range of open, industry standard switches, without being locked into one physical switch hardware vendor. But unlike network virtualization solutions such as NSX, Cumulus Linux is the network OS (NOS) for the physical switches, rather than a virtualization layer on top. Cumulus is part of the NSX ecosystem and integrated with NSX, so essentially you can run Cumulus on the physical switches and integrate it with NSX to provide the network virtualization (termination and VXLAN switching/routing in hardware is also supported on some switches). Cumulus is Linux for network switches, so it's easy to manage and very easy to automate. I happen to be working on a project now to build the best practices for Cumulus Linux with Nutanix and VMware vSphere, so I needed an easy way to get Cumulus installed on my lab switches from my MacBook Pro, which is what the remainder of this article is about.
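For context, one common way to install a NOS like Cumulus Linux on an open switch is via ONIE, which fetches the installer image over the network. The sketch below assumes the switch is already in ONIE, and the image filename, port and IP address are hypothetical, not the exact values used in this article.

```shell
# Sketch only: the filename, IP address and port are assumed values.

# On the MacBook: serve the Cumulus installer image from the current
# directory using Python's built-in web server.
cd ~/Downloads
python3 -m http.server 8000

# On the switch (console or SSH into ONIE): tell ONIE to fetch and
# install the NOS image from that web server.
onie-nos-install http://192.168.1.10:8000/cumulus-linux-amd64.bin
```

ONIE can also discover the installer automatically via DHCP options, which is handy once you are doing more than a couple of switches.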
CloudPhysics have come up with a great Halloween themed report that has some very interesting insights into what Ghosts and Goblins are lurking in virtual datacenters. I was particularly surprised by the 41% of clusters that don’t have admission control enabled. You can get the full report here. I’ve included the infographic here for your enjoyment.
Nutanix Web Scale NoSAN now meets NoDisk. I didn't know that the band Queen could predict the future of IT when I first listened to their song Flash Gordon, but the lyrics I've quoted above seem to suggest they could somewhat predict the future of the storage industry. Flash will undoubtedly have a big impact on IT, even if it is only just starting to penetrate the datacenter now (only a small percentage of total deployed storage is flash). So it is probably no surprise that the Nutanix Web Scale Converged Infrastructure platform would eventually include all flash options. On top of that we add Metro Availability, metro storage cluster type availability that takes only a few clicks to set up and is significantly simpler to operate and test than traditional metro solutions. So you can have your all flash and you don't need to compromise on any data services. Of course Metro Availability is just a software feature, so it is available on any of the Nutanix platforms; it will just take a software upgrade once the new version of the Nutanix OS is available (from 4.1). So why all flash?
My colleague Magnus Andresson (VCDX-56 and double VCDX, DCV/Cloud) has put together some short videos showing example solutions with Nutanix and vCloud Automation Center working together. vCloud Automation Center has recently been renamed vRealize Automation, also known as vRA (vee Raa! – intentionally not used in the title). I hope you enjoy these videos and that they give you some ideas of how you can integrate vCloud Automation Center into your solutions with Nutanix.
Another VMworld event is over and it's hard to believe it's been a whole 12 months since the last one. Certainly during the keynotes there was a lot of coverage of what VMware has achieved over the last 12 months, and it is impressive, especially in the end user computing and hybrid cloud spaces. But overall I felt that VMworld USA 2014 lacked some of the sparkle of last year, though I guess it's hard to top last year considering it was the 10th anniversary. This year seemed much more about building a solid foundation for a software defined datacenter, a software defined enterprise and a hybrid cloud model integrating applications with infrastructure, providing agility and flexibility without compromise. Although attendance was flat or a little down on last year, the breakout sessions were packed right up to the last session on Thursday. Instead of having our heads in the clouds this year it was all about vCloud Air, and we vRealized the product naming is about to change. So let's dive into what I think are some of the highlights.
VMware has announced that it will turn off TPS in upcoming versions of its ESXi hypervisor and its vCloud Air hybrid cloud service. This is due to a security bug that is considered a very rare possibility and only exploitable in very controlled and largely misconfigured environments. TPS, also known as Transparent Page Sharing, is a memory management technique that allows multiple VMs to share a read only copy of the same memory page. When a VM needs to update or write to a page, a new copy is created. The idea is that if there are many VMs with similar memory pages on the same physical host server, it will de-duplicate the pages and store only one copy. The result is that you can run more VMs per physical server while still achieving very good performance.
TPS has long been used as a competitive advantage by VMware over the other hypervisors. But realistically it hasn't been in wide use by most customers for some time (since ESX 3.5), as the amount of RAM per host has increased, because of the use of large memory pages (2MB instead of 4KB) on Nehalem and later processors, and because most customers don't want to run their systems at 100% utilization, so that they can handle bursts of activity. When using large pages, TPS only kicked in when a host was over 96% memory utilization, at which point large pages would be broken down into small pages that could be shared. However TPS has remained popular with service providers, in virtual desktop environments, and in some test and development environments, where overcommitment of memory may be acceptable.
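For those who do want to keep cross-VM page sharing after the change, a sketch of how to inspect and adjust the relevant host setting is below. This assumes ESXi shell access on a host running one of the patched builds that introduce the new TPS default; the setting name is the one VMware documented for controlling page sharing salting, and reverting it is an explicit acceptance of the disclosed risk.

```shell
# Sketch only: assumes an ESXi shell on a patched host where the
# /Mem/ShareForceSalting advanced option exists.

# Show the current inter-VM page sharing behaviour (salted per-VM
# sharing restricts TPS to pages within a single VM).
esxcli system settings advanced list -o /Mem/ShareForceSalting

# Revert to the legacy cross-VM sharing behaviour, if you accept
# the (very narrow) security risk described above.
esxcli system settings advanced set -o /Mem/ShareForceSalting -i 0
```

Service providers and VDI shops that rely on the memory savings are the ones most likely to weigh this trade-off deliberately rather than accept the new default.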