I recently read a good blog article by William Lam at virtuallyGhetto about a caveat with SSH keys and Lockdown Mode in ESXi 5. Now that SSH keys are fully supported in ESXi 5, and an authorized user with a key can continue to log in to the host even when Lockdown Mode is enabled, is Lockdown Mode really locked down enough?
With vSphere 4.x, if you enabled Lockdown Mode through vCenter it couldn’t be disabled through the host, even though an administrator could still log into the DCUI. The only way to be able to disable Lockdown Mode through the host was to have enabled it through the host in the first place. Unfortunately, enabling Lockdown Mode through the host has the effect of removing any locally defined users and groups.
This original behavior of Lockdown Mode could be a pain for an administrator who had lost access to vCenter, especially if vCenter was running on a host that was in Lockdown Mode. At that point it was almost as good as Total Lockdown Mode (no DCUI access) and could have resulted in a host rebuild, although you could always reboot the host and hope HA recovered the vCenter server (no good if all the servers were down, or if HA couldn’t restart vCenter). This was good for security, but not so good if you lost your vCenter and didn’t have any way of getting it back (I learned this the hard way, but fortunately found a way out of it). When I came across this situation in the real world I was able to mount the vCenter Server storage, register the vCenter VM on another host, power it on, and then, as soon as I had access to it, disable Lockdown Mode until the incidents were resolved. This was only possible thanks to iSCSI storage, as I had to bring it up at the DR site.
There has been a recent change to the way Lockdown Mode works, and it’s important to understand it, as it’s likely most people will now want to use it. Since the latest updates in vSphere 4.1, and also in vSphere 5, Lockdown Mode can be disabled from the DCUI for troubleshooting purposes regardless of where it was enabled. This is an emergency stopgap measure in case you lose access to vCenter. It requires physical access to the host or access via the host’s remote management card, in addition to knowing the username and password to access the DCUI. This is a great enhancement for the majority of customer environments and will allow a greater level of security without hindering troubleshooting when things go wrong. However, there will be some environments where Total Lockdown Mode is now more appropriate. The old rule still applies: if you enable Lockdown Mode through the DCUI you lose the host users and groups that were defined, so it’s always best to enable it in vCenter.
Don’t get caught out if you see the Change Lockdown Mode settings grayed out in the DCUI for a host if it’s not connected to vCenter (disconnecting a host from vCenter has this effect). This is expected and by design. Lockdown Mode is only applicable to hosts that are connected to vCenter.
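The rules described above can be summarized as a small decision function. This is only a sketch of the behavior as this post describes it, not a call into any VMware API. The names `HostState`, `EnabledVia`, `can_disable_from_dcui` and `local_users_removed` are hypothetical, and it assumes that "not connected to vCenter" means the host has been disconnected from vCenter, which is different from vCenter simply being unavailable.

```python
from dataclasses import dataclass
from enum import Enum


class EnabledVia(Enum):
    """Where Lockdown Mode was switched on (hypothetical names for this sketch)."""
    VCENTER = "vcenter"
    DCUI = "dcui"


@dataclass(frozen=True)
class HostState:
    lockdown_enabled_via: EnabledVia  # where Lockdown Mode was enabled
    new_behavior: bool                # True: updated vSphere 4.1 or vSphere 5
    connected_to_vcenter: bool        # False: host was disconnected from vCenter
    dcui_disabled: bool               # True: models Total Lockdown Mode


def can_disable_from_dcui(host: HostState) -> bool:
    """Return True if the rules in this post allow disabling Lockdown Mode at the DCUI."""
    if host.dcui_disabled:
        return False  # Total Lockdown Mode: no DCUI access at all
    if not host.connected_to_vcenter:
        return False  # the setting is grayed out (assumed to apply in all versions)
    if host.new_behavior:
        return True   # can be disabled regardless of where it was enabled
    # Original 4.x behavior: only if it was enabled at the host in the first place
    return host.lockdown_enabled_via is EnabledVia.DCUI


def local_users_removed(via: EnabledVia) -> bool:
    """Enabling through the host (DCUI) removes locally defined users and groups."""
    return via is EnabledVia.DCUI
```

For example, a host whose Lockdown Mode was enabled through vCenter is recoverable from the DCUI only when `new_behavior` is true, which is the scenario that caught me out under the original 4.x behavior.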
If you want to allow access to the ESXi hosts directly from a management jump box, then using SSH keys would be the way to go. However, this access method should only be used for troubleshooting. For normal day-to-day administration I would strongly recommend that vCenter and the vMA are used to run the RCLI commands against the hosts. This is the easiest way to administer the hosts while keeping good audit records of activities. That said, if the hosts are AD integrated and the logs are going to syslog, you will also get a record in the syslogs of every shell command executed by every user.
So based on this, do you think Lockdown Mode is really locked down enough? It would be good to get your input on this discussion.
This post first appeared on the Long White Virtual Clouds blog at longwhiteclouds.com, by Michael Webster. Copyright © 2012 – IT Solutions 2000 Ltd and Michael Webster. All rights reserved. Not to be reproduced for commercial purposes without written permission.
I understand about lockdown. But here is a situation that recently occurred with our vCenter 4.1 and ESX 4.1 server. We are running Lab Manager, and for some reason the consoles on the LM configurations were unable to communicate with the ESXi host. The error indicated that it failed to communicate with the vCenter host on port 902. We investigated this error and found several articles, but none proved to be helpful. We contacted VMware support and enabled Tech Support Mode and SSH for the ESXi server so support could look at the server. They attempted to restart the services, but this failed. Their recommendation was to reboot. OK, so we rebooted. When the system returned we were unable to get to the system by ANY console means. The physical console was locked out, as were all services. We essentially had a 'brick' for a server and no access to anything. We have not found any references to this type of failure. Our only recourse was to reinstall ESXi and reconnect to the Lab Manager service. A royal pain when you have hundreds of VMs that are now orphaned and inaccessible. VMware support was clueless as to why this occurred.
Hi Peter, that's very unfortunate. I think a bit more troubleshooting would have been useful prior to the reboot. Did you have a TAM involved in this process? There are a few different possibilities that could have caused those symptoms. The thing with lockdown mode in 4.1 is that only vpxuser can access the host, so you need to make sure the host is accessible to vCenter at all times, as this is the only way to disable lockdown mode. When troubleshooting, lockdown mode may need to be disabled so you can ensure host access when necessary. If your vCenter can't communicate with the host, that is the first problem to fix before anything else.
From my testing I've found that you lose local permissions not only if you enable lockdown mode from the DCUI; if you DISABLE lockdown mode from the DCUI you also lose host-level permissions. vCenter then still thinks the host is locked down, and you can no longer set lockdown mode on or off through vCenter. Not very helpful at all.