vSphere Upgrade Saga: Finally vSphere 6.7

It was time. I finally found the time to upgrade my environment from vSphere 6.5U1 to vSphere 6.7U1. The vSphere Upgrade Saga continues. No, I did not go to 6.5U2 or even the initial vSphere 6.7 release, as I was waiting for the tool that would allow me to converge my external PSC into an embedded PSC. Most things went quite well; there were a few upgrade issues, however.

As always, follow the proper upgrade steps as stated in KB53710. This has always been a crucial part of the vSphere Upgrade Saga.

First Problem

vSphere 6.7 upgraded quite well until I got to the external PSC. The update required me to log in to the Platform Services Controller as root, enable the shell, and then use:

chsh -s /bin/bash root

Otherwise, the upgrade would fail because it could not log in. Once I completed the above step, the external PSC upgrade finished without problems.
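For reference, here is the full sequence as I ran it, a minimal sketch assuming the standard appliance-shell commands on the PSC:

shell.set --enabled true    # allow BASH access from the appliance shell
shell                       # drop into BASH
chsh -s /bin/bash root      # make BASH the default shell for root
grep '^root' /etc/passwd    # verify the entry now ends in /bin/bash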

Second Problem

When it came time to upgrade my vCenter Server, everything worked well except during the second stage of the upgrade, when all data was to be copied over. The original vCenter Server did not have enough space in the / partition. To fix that, I used the following commands after using the shell option to get to root:

rm -rf /var/log/*.[0-9]*.gz /var/log/*.20*.gz
journalctl --vacuum-size=250M
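For context, it helps to confirm what is actually filling the root partition before removing files. A quick check, assuming a standard VCSA layout:

df -h /                      # how full is the root partition?
du -xsh /var/log             # rotated logs are the usual culprit
journalctl --disk-usage      # space held by the journal itself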

Third Problem

vSphere 6.7 provides a method to converge an external PSC into an embedded PSC. The convergence of my external PSC went without a hitch once I specified the proper host hosting the vCenter Server Appliance. The problems started once I went to rehook my services, such as NSX, into the embedded PSC. I was following Emad Younis's wonderful description of the process.

NSX

The issue was that the vSphere 6.7 lookupservice/sdk certificate was not the current certificate. vJenner had a good writeup on how to correct it. Trying to get the fingerprint of the older certificate did not work for me, so I went directly to the site and got it via the browser's certificate view functionality. Either approach should work.
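If the browser route is inconvenient, the certificate's SHA-1 fingerprint can also be pulled from the command line. A sketch, with vcenter.server standing in for your PSC or vCenter address:

echo | openssl s_client -connect vcenter.server:443 2>/dev/null | openssl x509 -noout -fingerprint -sha1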

I had to unregister vRealize Infrastructure Navigator (com.vmware.vadm, com.vmware.vadm.ngc51, com.vmware.vadm.ngc60) and VMware Data Protection (com.vmware.vdp2, com.vmware.vdp2.config) from the MOB. For good measure, I also disconnected HPE OneView for vCenter from vCenter and then rebooted vCenter. This was to fix the ls_update_certs.py command per KB2150057. There was a little trial and error: not only did I have to unregister the extensions from the MOB, but I also had to unregister them from the lookupservice itself. The following code helped:

for x in com.vmware.vadm com.vmware.vadm.ngc51 com.vmware.vadm.ngc60 com.vmware.vdp2 com.vmware.vdp2.config
do
  y=`python lstool.py list --url https://vcenter.server/lookupservice/sdk 2>&1 | grep $x | grep "Service ID" | awk '{print $3}'`
  python lstool.py unregister --url https://vcenter.server/lookupservice/sdk --no-check-cert --id $y --user USERNAME --password PASSWORD
done

# New cert is located in /root/Cert/new.crt
python ls_update_certs.py --url https://vcenter.server/lookupservice/sdk --fingerprint <old_fingerprint> --certfile /root/Cert/new.crt --user USERNAME --password PASSWORD 2>&1 | tee /tmp/certificate_manager.log

If I did not use the tee command, the output was too dense to parse, making it impossible to find what had to be removed from the extensions within the MOB. Your list of problem extensions may be different from mine. Mine are purely historical, as those tools no longer work with vSphere 6.7.

Then a new problem occurred, related to KB2121689: I had three trust anchors, two of which were identical but incorrect. The KB article helped me sort them out. Once the proper certificates were updated, NSX Manager connected properly.
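To see how many distinct trust anchors are registered before sorting them out, something like the following works. A sketch, assuming lstool.py sits in its usual location on the VCSA and that its list output includes the SSL trust strings:

cd /usr/lib/vmidentity/tools/scripts/
python lstool.py list --url https://vcenter.server/lookupservice/sdk --no-check-cert 2>&1 | grep -i 'trust' | sort | uniq -c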

vRealize Operations

I ended up having duplicate vROps certificates. I needed to remove the old ones, then delete and recreate the vCenter Adapter. vROps 7.0 now includes a certificate manager within the GUI, so I used that to delete all the older certificates related to my converged vCenter Server. I then cleared my browser cache and recreated the vCenter Adapter. The same should be done for any other vCenter-related adapters, such as vSAN, Service Discovery, etc. Problem solved.

vRealize Log Insight

This was simply a matter of accepting the new certificate during the test of the connection to vCenter.

vRealize Network Insight

This was simply reusing the same user and resetting the password to reestablish connectivity. In reality, that is the process I followed, but I was never asked to approve any certificates, so I am not sure there was ever a break in communication. This was more of a sanity check.

Horizon View

Horizon View was much different. I could not get the View Connection Broker to accept the new certificate. I was not all that concerned about the desktops, so I went into ADSI Edit and removed the OU=VirtualCenter representing my vCenter Server, the OU=Data Disks representing my desktops, and the OU=Applications and OU=Server Groups entries for the old desktop pool. Then I went into vCenter and removed all the old VMs. There were not many of each, so it was pretty simple. Once I did that, I was able to reconnect to vCenter, recreate the desktop pool, and then recreate the desktops.
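For reference, the View LDAP (ADAM) instance I edited is reached from a Connection Server with the usual ADSI Edit connection settings; treat the exact OU locations as those of my environment:

# ADSI Edit -> Connection Settings
#   Connection Point: DC=vdi,DC=vmware,DC=int
#   Computer:         localhost:389
# The OU=VirtualCenter, OU=Data Disks, OU=Applications, and
# OU=Server Groups entries all live under that naming context.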

I am sure there was something I could have edited in the OU=VirtualCenter properties for my vCenter Server, but I could not find it. Deleting and starting my desktop pool over fresh seemed like the easiest approach. With lots of desktops, I would not have done this; at scale, I would have found a different approach.

If I did not remove the OU=Data Disks entries representing my old desktops, I got a Java error about dn=null. That would not do, as it kept everything from working.

Site Recovery Manager

This required me to disable UAC (a registry edit was required) in order to change the configuration. However, I did not change the configuration; I removed the older version and installed the latest, as I was not using SRM at the time and had not in the past. For older installs that do use SRM, you will need to reconfigure the PSC settings.
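The registry edit in question is the usual EnableLUA toggle. A sketch (a reboot is required for it to take effect, and the value should go back to 1 once the upgrade completes):

reg add "HKLM\SOFTWARE\Microsoft\Windows\CurrentVersion\Policies\System" /v EnableLUA /t REG_DWORD /d 0 /f
rem After the upgrade, restore UAC:
reg add "HKLM\SOFTWARE\Microsoft\Windows\CurrentVersion\Policies\System" /v EnableLUA /t REG_DWORD /d 1 /f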

HPE OneView for vCenter

Since I had originally unregistered HPE OneView for vCenter while trying to fix NSX, I took the opportunity to upgrade OneView for vCenter and then reregister it with vCenter. The upgrade and reregistration went smoothly, and all works as expected, even in the HTML5 client. I also upgraded the HPE OneView and Global Dashboard components at the same time. HPE Global Dashboard had some particularities: an AD Group needed to be defined with my user placed into it. It also had issues with the expired Domain Controller certificate, which in turn required me to deploy Microsoft AD Certificate Services on another server to update that certificate. Once that was done, I was able to configure AD within HPE Global Dashboard and update the certificates on HPE OneView as well. Certificate management is becoming more of an issue than many realize these days. That is a discussion for another post.
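Checking the Domain Controller's LDAPS certificate dates from any machine with openssl makes such an expiry obvious. A sketch, with dc.example.local standing in for your DC:

echo | openssl s_client -connect dc.example.local:636 2>/dev/null | openssl x509 -noout -dates -subject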

Veeam Availability Suite

Veeam Availability Suite was similar to vRealize Log Insight. All I needed to do was reconnect the servers (Veeam ONE, Veeam Backup & Replication) using the new certificate. Next, I followed KB2784 to work around an API issue. Then backups started happening once more. I also took this opportunity to upgrade Veeam Availability Suite to the latest updates. Veeam ONE showed many older alerts that were not cleared after the update, so I went through and cleared them. Then a bunch of low disk space and other smaller errors appeared, one of which was with my Horizon View installation. I fixed the disk space and other errors while I was at it.

Fourth Problem

The vSphere Replication upgrade had major issues. The upgrade failed because the extension service was not registered (I did not remove it). Further, the Solution User already existed. I probably could have made the original registration work given enough time. Still, I did not actually use vSphere Replication, so the best approach was to remove the old unregistered version, reinstall, and reregister the appliance. The steps were:

  • Deploy VR 8.1, as 8.1.1 would not boot to a point where the VAMI could be accessed.
  • Remove the existing VR user <== This was the major problem.
  • Remove the extension com.vmware.vcHms via the MOB (if it was added; I tried to save and register multiple times, so it was installed); see the sketch after this list.
  • Set up the VAMI configuration and configure the appliance to use my embedded PSC and to properly register the extension.
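For the MOB step above, the usual route is the ExtensionManager's UnregisterExtension method. A sketch of the path, assuming the standard MOB layout:

# https://vcenter.server/mob/?moid=ExtensionManager
#   -> UnregisterExtension -> extensionKey: com.vmware.vcHms -> Invoke Method
# Afterward, extensionList should no longer contain com.vmware.vcHms.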

Fifth Problem

Site Recovery Manager had serious issues. It required UAC to be fully disabled in order to upgrade. I dislike disabling security controls, but you can reenable UAC after the upgrade takes place (the same registry toggle shown above).

Conclusion

Upgrading to vSphere 6.7 takes planning; however, even with good planning, there are often issues. These are just some of the issues and decisions I had to make to upgrade my small environment, with links to appropriate articles and KBs to fix common problems. NSX was my main stumbling block after I converged the external PSC into an embedded PSC.

Comments

  1. Great help! We spent half an hour trying to get the PSC upgraded, and the tool couldn't connect. We searched, found this page, and read "First Problem" about the PSC not dropping into a shell on login. That was not mentioned anywhere in the VMware documentation. Good catch; it saved us a lot of time.

    1. Glad I could assist. That was a surprising problem, and I am so glad I was able to resolve it myself! I hope the rest of your upgrade goes well.
