vSphere 5.1 – Train Wreck in Slow Motion

vSphere 5.1 arrived this summer to no great fan-fare. We waited a few weeks, heard no sounds of howling pain (we did not listen very hard, I guess), and decided to proceed with upgrading vCenter.  I have been digging out of the wreckage ever since.

How do you know if upgrading to vSphere 5.1 is right for you?  Here are a few bullet points to help you decide:

  • Do you have CA-signed (externally trusted, or in-house Enterprise CA server) certificates in use in your current vSphere environment?
  • Are you using an external MS SQL Server to host your vCenter database?  Are you using mirrored SQL databases?
  • Is your environment currently stable and reliable?

Is you answered “yes” to any of these questions, do not upgrade to vSphere 5.1.  At least, not yet. Do deceive yourself that that the vSphere 5.1.0a release will be any help, either.

What is the big problem, you ask?  The major source of pain in this release is the new “Single Sign-On Service” that handles authentication and authorization for all of the other vSphere components.  This component of vSphere has twitchy SSL certificate requirements that are poorly documented by VMware.  The SSL requirements are so touchy that in our case, even the self-signed certs generated by the installer did not work.  Unlike all of the other current vSphere components, it does not support mirrored SQL databases.  It has new permissions requirements in AD that are not documented at all, and at the time of our installation, did not even have a KB entry.  The installer is very buggy, most notably in that it requests that you set and admin password for the SSO Service, and demands password complexity, but it does not inform you when your password is unacceptably long (i.e. longer than 32 characters) or when your password contains illegal characters (i.e. most regular expression special characters).

So, if you do upgrade, be prepared for an extended service outage.  Give yourself a long service window.  Have your VMware support contract numbers handy.  Familiarize yourself with the myriad of locations that are used to log vCenter data.  Learn to use PowerShell (get-childitem -recurse | select-string -pattern “configSettingThatThevCenterInstallerBorkedUp”) and keep this page bookmarked:

http://derek858.blogspot.com/2012/09/vmware-vcenter-51-installation-part-1.html

Here are UVM we are indebted to Derek Seaman for his thorough documentation of the vSphere 5.1 installation process and detailed SSL certificate generation instructions.

Following are some installation quirks that we encountered, presented mainly for my own reference, but maybe you will find them useful as well:

  1. “Performance Charts Experienced an Internal Error” seen in the vSphere client after the upgrade:
    This happened because vCenter Web Services did not read the database mirroring configuration from our defined ODBC data sources… it grabbed the primary database only, and not the mirror data.  The fix?  Edit:
    “%ProgramData%\VMware\VMware VirtualCenter\vcdb.properties”
    Find the “url=” line, and append:
    ;failoverPartner\=[mirrorServer]
    (Where [mirrorServer] the the actual DB mirror host name.  Don’t forget the “\” before the “=”.)
  2. Some users with permissions to vCenter 5.0 cannot log in after the upgrade.  In the vSphere web client, these users are marked as “disabled”:
    This occurred for use for two reasons:

    1. The SSO Service installer prompts us for a service account to use during install.  Following installation, the service is seen to be running as “SYSTEM”, and not the specified service account.  Change the Service to run with your planned service account using services.msc after the installation.  As an alternative, you could specify those credentials  in the vSphere Web Client -> Administration ->Sign-On and Discovery -> Configuration -> Identity Sources.  Edit your identity source, and under “Authentication Source” select “password”, then enter your service account credentials.
    2. The SSO Service needs to read account attributes that cannot be read by a standard user account (at least, not in an AD forest at a Server 2008 R2 functional level).  When we asked VMware support to define the required permissions, they replied: “an account has to have at least read-only permissions over the user and group Organization Units furthermore read permissions also on the properties of the users, such as UserAccessControl.”  After some experimentation, I just gave the SSO Service account “read all properties” rights to the account OU, and login abilities were restored.
  3. Our SSO Service broke when the mirrored database servers that we currently use for vCenter services had a failover event.  During install, I used the standard “failoverPartner=” JDBC connection string property to specify our failover database server.  Unfortunately, the SSO service ignores this property.  I could not identify an acceptable workaround for this problem. Ultimately, I installed a SQL Express instance on our vCenter server to house just the SSO database.  I tried:
    1. Using SQL Aliases, but this failed because the JDBC driver is not aware of SQL Aliases.
    2. Using a script that edits the local “hosts” file on a database failover event.  I then used this host name alias for the database connections.  This almost worked.  I edited the following files to use the host alias, instead of the actual database server host name:
      %ProgramFiles%\VMware\Infrastructure\SSOServer\webapps\ims\WEB-INF\classes\jndi.properties
      and:
      %ProgramFiles%\VMware\Infrastructure\SSOServer\webapps\lookupservice\WEB-INF\classes\config.properties
      Upon restart, the SSO Service was able to connect to the database, but it did not survive a failover.  Apparently the old database connection information was still in use somewhere, and VMware support was not helpful in identifying all of the database configuration locations for SSO.
    3. While VMware does have command line configuration tools that could have been used to script reconfiguration of the database connection strings, I have deemed that they are too fragile for production use.
  4. The option to authenticate using Windows session credentials in the vSphere Client (traditional version) stopped working after the 5.1 upgrade.  This is a bug that is fixed with the 5.1.0a release.  Unfortunately, the SSO installer for 5.1.0a does not work in upgrade mode.  Aargh!  I had to uninstall the SSO service to get the updated files into place.  Guess what the uninstaller does?  That’s right… it erases the SSO Service database (drops all tables!  Gah!), and deletes all configuration files for the service.  Before you upgrade, make sure that you have an SSO Service backup bundle.  I did, but it was outdated.  I had to re-register all of the vCenter components with SSO manually, which was a pain in the butt.
  5. vSphere Update Manager registered with vCenter using the wrong DNS name.  We could not scan ESXi hosts for updates, because vCenter was telling them to connect to an invalid URL.  To fix, I needed to search the registry for the incorrect host name, and replace with the correct one:
    “HKEY_LOCAL_MACHINE\SOFTWARE\Wow6432Node\VMware, Inc.\VMware Update Manager\VUMServer”
    For good measure I also edited:
    %Program Files(x86)%\VMware\Infrastructure\Update Manager\extension.xml
    To contain the correct host name.  Then we restart the Update Manager services, and we are back in business.
  6. Other fun related to VMware Update Manager… the SQL Account used by Update Manager cannot have a password that exceeds 24 characters in length. Special characters in the SQL Account password also may cause problems.

So, VMware is not my favorite company this month.  On to solve more problems.  We still cannot add new permissions to vCenter, and Performance Charts are loading like a slug in winter.

Comments are closed.