Category : How-To

VMware SRM 6.1 – Configure Array-Based Replication

Introduction

 

This how-to will walk through the installation and configuration of array-based replication features for VMware Site Recovery Manager 6.1.

Before configuring array-based replication for use with VMware SRM, there are some prerequisites.  First, visit the VMware Compatibility Guide to determine whether your specific array vendor is supported for use with SRM.  Second, array-based replication has to be configured on the storage side; that portion is out of scope for this post, as I did not have access to do so.

vmware_hcl_example

There are several ways to search the compatibility guide, but to be specific, you can select entries from the areas highlighted above.  The highlighted bottom section shows your results once you click “Update and View Results.”  I point this step out because if you assume your array vendor is supported and don’t verify first, you could end up wasting time on planning and design.

For this example, we are using SRM 6.1 with the Fibre Channel protocol on IBM SVC-fronted DS8Ks in both sites.  I mention this because when I first set out to find the SRA for our solution, I attempted to use the “IBM DS8000 Storage Replication Adapter”, only to find out later that it wasn’t the correct one.  The correct SRA for my environment is the “IBM Storwize Family Storage Replication Adapter”, so there may be a little trial and error involved; however, if you do it up front during testing, you’ll save yourself some time later when deploying to production.

That said, once you’ve verified that your storage is supported and which version of the SRA to download, you can get it from the VMware downloads site (you will need to log in).  Be sure to also verify that the version of the SRA you are downloading is compatible with the version of array manager code you’re running.

 

Installing the SRA

Before you Begin – Prior to installing the SRA on the SRM server in each site (protected and recovery), you should have already paired the sites successfully.  Also, if you haven’t installed SRM yet, you will need to, otherwise the SRA installer will fail once it discovers that SRM is not installed.

Installing the SRA should be straightforward and painless, as there are not many options to configure during installation.  Once the installation is completed on both the protected and recovery SRM servers, proceed.

 

Verify That SRM Has Registered the SRAs

  1. Once you’ve installed the SRA on each site’s SRM server, log into the vSphere Web Client, go to Site Recovery > Sites, and select a site.

    site_recovery_sites_sra_monitor

    From this view, you can see which SRA has been installed, its status, and compatibility information.
  2. Click the rescan button to ensure the connection is valid and there are no errors.

    srm_sra_rescan_button

Configure Array Managers

After pairing the protected and recovery sites, you will need to configure the respective array managers so SRM can discover replicated devices, compute datastore groups, and initiate storage operations.  You typically only need to do this once; however, if array access credentials change, or you want to use a different set of arrays, you can edit the connections accordingly.

Pre-Requisites

  • Sites have been paired and are connected
  • SRAs have been installed at both sites and verified

Procedure

  1. In the vSphere Web Client, go to Site Recovery > Array Based Replication.

    srm_abr_settings_1_1

  2. On the Objects tab in the right window pane, click the icon to add an array manager.

    srm_abr_settings_1_2

  3. Select one of the two options for adding array managers (pair or single), then click Next.

    srm_abr_settings_1_3

  4. Select a pair of sites for the array manager(s), and click Next.

    srm_abr_settings_1_4

  5. Enter a name for the array in the Display Name field, and click Next.

    srm_abr_settings_1_5

  6. Provide the required information for the type of SRA you selected, and click Next.

    srm_abr_settings_1_6

  7. If you chose to add a pair of array managers, enter the paired array manager information, then click Next.

    srm_abr_settings_1_7

  8. Tick the checkbox beside the array pair you just configured, and click Next.

    srm_abr_settings_1_8

  9. Review your configuration, then click Finish when ready.

    srm_abr_settings_1_9

 

Rescan Arrays to Detect Configuration Changes

SRM performs an automatic rescan every 24 hours by default to detect any changes made to the array configurations.  After any change to either site, whether a reconfiguration or adding/removing devices, it is recommended to perform a manual rescan so the datastore groups are recomputed.  If you need to change the default interval at which SRM performs a rescan, you can do this in the advanced settings for each site by editing the storage.minDsGroupComputationInterval advanced setting:

srm_abr_settings_1_11

To perform a manual rescan after making any configuration changes:

  1. Go to Site Recovery  > Array Based Replication
  2. Select an array for either site
  3. On the Manage tab of the selected array, click the Array Pairs sub tab
  4. Click the rescan button to perform a manual rescan.

    srm_abr_settings_1_10

 

Once you’ve got all of the above configured, you can begin setting up your protection groups and recovery plans.


Zerto: Perform a VPG Move (VM Migration)

In a situation where a workload needs to be migrated from a protected site to a recovery site (or site A to site B) in order to change where the production workload runs, you can perform a VPG move.

From what I’ve seen, the difference between a VPG move and a failover is that the Failover option assumes the protected site has failed, so systems may not be cleaned up automatically on the protected site.  With a move, the protected site is cleaned up as soon as the move is completed and committed, unless you choose to re-protect the workload in the other direction.  The commit can be automatic or manual; the maximum time you have to commit is 24 hours, and the delay is configurable.

One recommendation I have here is that before you perform these steps, perform a recovery test on the VPG you’d like to move to ensure that recovery steps are completed as expected, and that the system is usable at least in a testing capacity.

  1. Log in to the Zerto UI
  2. From the dashboard screen, go to Actions > Move VPG.

    zerto_perform_vpg_move_1_2

  3. Select (tick the checkbox for) the VPG you want to move, and click Next.

    zerto_perform_vpg_move_1_3

  4. Select your options for the Execution Parameters, and click Next.  For this example, I will select “none” for the commit policy, to demonstrate where to commit the migration task when you are ready.

    zerto_perform_vpg_move_1_4

    > Commit Policy: Auto-Commit - you can delay up to 24 hours (specified in minutes), or select 0 
    to automatically commit immediately when the migration process is completed.
    > Commit Policy: Auto-Rollback - You can delay up to 24 hours (specified in minutes), default 
    delay is 10 minutes
    > Commit Policy: None - You must manually select whether or not to commit or rollback, based 
    on your results.
    > Force Shutdown - Use this in the event VMware Tools isn't running and an automatic 
    shutdown isn't possible. Force shutdown will first attempt to gracefully shut the VM down, and if that doesn't work, 
    it will power off the VM on the protected site.
    > Reverse Protection - This will automatically sync changes from the recovery site back to the 
    protected site in case you want to be able to re-protect a system after a migration. This eliminates the need 
    to have to re-initialize synchronization in the other direction. If reverse protection is selected, a delta 
    sync will take place to re-protect after the migration is completed. Caveat - You cannot 
    re-protect if you select "NONE" as the commit policy.
    > Boot Order - (Defined in VPG configuration, but displayed here)
    > Scripts - (Defined in VPG configuration, but displayed here)
    
  5. Review the summary, and when ready, click Start Move.
    During promotion of data, you cannot move a VM to another host.  If the host is rebooted
    during promotion, make sure the VRA on the host is running and communicating with the ZVM before 
    starting up the recovered VMs.

    zerto_perform_vpg_move_1_5

  6. Since we have selected a commit policy of “none”, once the migration is ready for completion, the Zerto UI will alert you that there is a task awaiting input.  Click on the area highlighted below.

    zerto_perform_vpg_move_1_6_a

    Select either Commit (checkmark) or Rollback (undo arrow):

    zerto_perform_vpg_move_1_6_b

  7. At this point, you can also choose whether or not to reverse-protect.  Make your selection and click Commit.

    zerto_perform_vpg_move_1_7_a

    The task will update as seen below:

    zerto_perform_vpg_move_1_7_b

    Once you commit the move, the data in the protected site is then deleted, thus completing the migration.
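
To keep the commit policy options straight before running a move, here is a minimal Python sketch of the delay rules described in step 4.  It is only a planning aid and does not talk to Zerto; the policy labels, the `CommitPolicy` class, and the `validate` function are my own hypothetical names, and reverse protection is not modeled.

```python
from dataclasses import dataclass
from typing import List

MAX_DELAY_MINUTES = 24 * 60      # the 24-hour ceiling, expressed in minutes
DEFAULT_ROLLBACK_DELAY = 10      # default Auto-Rollback delay listed in step 4
POLICIES = ("auto-commit", "auto-rollback", "none")  # shorthand labels (assumption)


@dataclass(frozen=True)
class CommitPolicy:
    kind: str
    delay_minutes: int = 0       # 0 with auto-commit means "commit immediately"


def validate(policy: CommitPolicy) -> List[str]:
    """Return a list of problems; an empty list means the selection is consistent."""
    problems = []
    if policy.kind not in POLICIES:
        problems.append(f"unknown commit policy: {policy.kind!r}")
    elif policy.kind != "none" and not 0 <= policy.delay_minutes <= MAX_DELAY_MINUTES:
        problems.append(
            f"delay must be 0-{MAX_DELAY_MINUTES} minutes, got {policy.delay_minutes}"
        )
    return problems


if __name__ == "__main__":
    candidates = (
        CommitPolicy("auto-commit", 0),
        CommitPolicy("auto-rollback", DEFAULT_ROLLBACK_DELAY),
        CommitPolicy("auto-commit", 2000),   # longer than 24 hours, so rejected
    )
    for candidate in candidates:
        print(candidate, validate(candidate) or "ok")
```

With a policy of "none" the delay is ignored, which matches the manual commit/rollback flow shown in steps 6 and 7.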


Zerto: Create a Virtual Protection Group (VPG)

This blog is the next step following the creation/deployment of the VRAs.

To begin protecting virtual machines, you will need to configure virtual protection groups (VPGs).  A virtual protection group is an affinity grouping of VMs that make up an application.  A VPG can contain one or more virtual machines, and holds all the required protection settings, which include:

  • Boot Order
  • re-IP settings for testing and recovery
  • Resource mappings
  • Offsite backup
  • Journaling
  • Re-protection settings

Once a VPG is configured, initial synchronization of the protected virtual machines begins, and once synced, the VMs are continuously protected.

Important:

When performing failover, ALL VMs in the VPG will be failed over, and you are not able to select 
specific VMs within the group to be recovered.

Tips

  • For granular protection and failover, VPGs can be set up containing a single VM.  This helps if your migration/failover plan requires picking and choosing systems to recover in an order you specify, or when not all involved VMs need to be migrated or failed over.
  • Do not group ALL virtual machines into 1 VPG, as performing a recovery will attempt to recover everything contained within the VPG and in some cases, that’s not the best idea.
  • Whenever possible, group servers that depend on each other or make up an application together. This will allow you to make use of boot options, order, or delay to bring them up in the correct order. This will also prevent missing crucial application servers during recovery or migration.
  • Make use of the test feature for DR testing by setting up an isolated VLAN/portgroup which will allow live testing without impacting production.
  • Make use of the re-IP feature to automate any IP address change that needs to happen either on the test network or recovery network.

VPG Creation

  1. Log in to the Zerto UI
  2. Go to the VPGs tab, and click New VPG.

    create_vpg_1_2

  3. Specify a name for the VPG and set the priority, then click Next.
    In VPGs with different priorities, updates for the VPG(s) with the highest 
    priorities are transferred over the WAN before others.
    
    

    create_vpg_1_3

  4. Select the VM(s) you want to include in this VPG, press the right-arrow to move to selected VMs, then click Next.
    Using the search box in the "Available VMs" window will help you minimize the 
    number of VMs listed and focus only on the one(s) you're looking for.
    Zerto uses the SCSI protocol, so only VMs whose disks are configured for (and support) 
    SCSI can be selected to be part of a VPG.
    
    

    create_vpg_1_4_a

    create_vpg_1_4_b

  5. Specify the recovery site and values to use for replication to the site, then click Next.

    create_vpg_1_5

  6. Specify the storage requirements for this VM and click Next.
    If you have pre-seeded the volumes, check the box beside the disks 
    and click the Edit Selected link.  Select Preseeded Volume, then browse to the VMDK 
    for that volume.  Repeat for any additional disks that you have pre-seeded.  This 
    is recommended if your VM is large, and has a high rate of change, or the WAN link 
    is shared and bandwidth is limited.

    create_vpg_1_6

  7. Specify the failover/move network (the network that the recovered VM will run on), the recovery folder, any scripts, and click Next.
    Failover Test Network is optional, but recommended if you will be testing 
    failover prior to committing.
    

    create_vpg_1_7

  8. Enter the NIC details to use for the recovered VM, and click Next.
    In some cases, if you're replicating within the same vCenter or cluster, you 
    may end up with a duplicate MAC address warning when recovering, so to avoid this, you 
    can create a new MAC address on the recovery VM during recovery.  In any case, you 
    can also re-IP the VMs as part of the recovery procedure.  To view these 
    settings, check the box beside the VM(s) and click the Edit Selected link.

    create_vpg_1_8

  9. Select whether or not you want to create an offsite backup that can be stored for up to a year, then click Next.  If you don’t need to create a backup, leave this screen at the defaults, then click Next.
    For more information on backups with Zerto, refer to the help file 
    (click the ? button at the top right of this window), or see the Zerto Virtual 
    Manager Administration Guide.

    create_vpg_1_9

  10. Review the VPG settings summary, and if you don’t need to go back and make any changes, click Done.

    create_vpg_1_10

 


Zerto: Deploy Virtual Replication Appliances

If you’ve followed along with Zerto: ZVM Installation, this entry is a continuation, and provides steps to deploying the Zerto Virtual Replication Appliances.

After installation has succeeded, open a browser, and connect to https://ZVMFQDN:9669/zvm.

Notes:

  • If this VM lives in a protected network for management/utility servers, you might need to allow port 9669 from your local network to the network the ZVM lives in.  The Zerto Standalone UI, vCenter Web Client, and vCenter C# client all use port 9669 to access the ZVM.
  • Be sure to use a supported browser.  Chrome, Firefox, and IE 11+ are recommended by Zerto.
  1. Log on using your vCenter credentials.

    zerto_vra_deploy_1_1

  2. Enter a license key and click Start.
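
If the page doesn't load, it helps to rule out the firewall first.  The sketch below is a rough Python check, assuming nothing beyond the URL pattern and port 9669 mentioned above; the function names and the example hostname are hypothetical.

```python
import socket

ZVM_PORT = 9669  # port the ZVM UI listens on, per the notes above


def zvm_url(zvm_fqdn: str, port: int = ZVM_PORT) -> str:
    """Build the ZVM UI address in the https://ZVMFQDN:9669/zvm form."""
    return f"https://{zvm_fqdn}:{port}/zvm"


def port_is_open(host: str, port: int = ZVM_PORT, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


if __name__ == "__main__":
    # "zvm01.example.local" is a placeholder; substitute your own ZVM FQDN.
    print(zvm_url("zvm01.example.local"))
```

Calling `port_is_open("zvm01.example.local")` from your workstation only tells you whether TCP 9669 is reachable; it says nothing about whether the ZVM service itself is healthy.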

After entering the license key and clicking Start, you’re taken to the dashboard.  However, before you can start protecting VMs, you will need to install the VRAs on the hosts in the site and pair the protected and recovery sites.

Install the VRAs

The Zerto installation includes the OVF template for VRAs.  A VRA must be installed on every host that manages protected VMs in the protected site, and on every host that will manage VMs in the recovery site.

The VRA compresses data that is passed across the WAN from the protected to recovery site, and automatically adjusts the compression level according to the CPU usage, totally disabling it if required.

A VRA can manage a maximum of 1500 volumes, whether they are protected or not.

VRA Requirements

Each VRA must have:

  • 12.5GB datastore space
  • at least 1GB of reserved memory
  • Each host installed to must be at least ESX/ESXi 4.0 U1 and have ports 22 and 443 enabled for the duration of the installation.

If you are installing to ESXi 5.5 or higher, the VRA connects to the host with user credentials by way of a VIB, so it is not necessary to enter the root password.  Otherwise, the password for the host root account is required.

Before VRA deployment, you should have IP addresses reserved, as it is not recommended to use DHCP.  Be sure to also have the subnet mask and default gateway on hand.

If you do not have SSH enabled on your hosts, the ZVM will attempt to enable and disable it during the installation of the VRA.
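
For planning purposes, the per-VRA figures above add up quickly across a cluster.  Here is a small Python sketch of that arithmetic, assuming one VRA per host and using only the numbers listed in the requirements; the function name and dictionary keys are hypothetical.

```python
VRA_DATASTORE_GB = 12.5        # datastore space per VRA
VRA_MIN_RESERVED_RAM_GB = 1    # minimum reserved memory per VRA


def vra_footprint(host_count: int, ram_per_vra_gb: float = VRA_MIN_RESERVED_RAM_GB) -> dict:
    """Total resources needed to deploy one VRA on each of host_count hosts."""
    if host_count < 0:
        raise ValueError("host_count cannot be negative")
    if ram_per_vra_gb < VRA_MIN_RESERVED_RAM_GB:
        raise ValueError("each VRA needs at least 1GB of reserved memory")
    return {
        "datastore_gb": host_count * VRA_DATASTORE_GB,
        "reserved_ram_gb": host_count * ram_per_vra_gb,
        "static_ips": host_count,  # one reserved address per VRA, since DHCP is discouraged
    }


if __name__ == "__main__":
    print(vra_footprint(8))
```

For an 8-host cluster at the minimum memory setting, that works out to 100GB of datastore space, 8GB of reserved memory, and 8 reserved IP addresses.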

Important: Do not snapshot a VRA, as it will cause problems with replication!  I actually
forgot to exclude the VRAs from backups, and CommVault attempted to back them up after I had
configured my first VPG, and I ended up having to re-deploy the VRAs.  My advice is to create a
folder for the VRAs in your vCenter folder structure and have that folder excluded from backups
altogether.  Don't forget to move the VRAs into the folder as soon as they're deployed.

Installation

  1. Log in to the Zerto Manager UI
  2. Click on the Setup tab.

    zerto_vra_deploy_2_2

  3. Locate the host you want to deploy the VRA to, and check the box beside it.  Once you have selected the host, click New VRA.
    Note:  If you select multiple hosts, clicking the New VRA link
    will only install on the first host that you have selected.

    zerto_vra_deploy_2_3

  4. Specify the host, datastore, network, RAM, group, and enter the network details, then click Install.  Repeat the steps for each additional VRA you need to deploy (one per host).
    Note: When you deploy a VRA, Zerto will automatically reserve the amount of
    memory equal to what you specify in the VRA RAM settings.  This amount of RAM is the maximum buffer
    size for the VRA that is used to buffer IOs written by the protected virtual machines before the
    writes are sent over the network to the recovery VRA.  The recovery VRA also buffers incoming IOs
    until they are written to the journal.  If a buffer becomes full, a Bitmap Sync is performed after
    space is freed up in the buffer.
    The protecting VRA can use up to 90% of its buffer for IOs to send to the recovery VRA, which can
    use up to 75% of its buffer before it is full and requires a bitmap sync.

    zerto_vra_deploy_2_4

  5. After all VRA installations are completed, the Setup tab will contain more information for each host that has a VRA installed.

    zerto_vra_deploy_2_5

Once you’ve completed these steps for each host requiring a VRA, you can create Virtual Protection Groups and start protecting your workloads.
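
To put numbers on the buffer note in step 4, here is a minimal Python sketch.  It assumes, as that note says, that the RAM you assign to a VRA is its maximum buffer size, and it applies the 90% (protecting) and 75% (recovery) figures from the same note; the function name and role labels are hypothetical.

```python
# Fraction of the buffer each side can fill before a bitmap sync is needed,
# taken from the step 4 note.
USABLE_FRACTION = {"protecting": 0.90, "recovery": 0.75}


def usable_buffer_mb(vra_ram_gb: float, role: str) -> float:
    """Approximate usable IO buffer, in MB, for a VRA with vra_ram_gb of RAM."""
    if role not in USABLE_FRACTION:
        raise ValueError(f"role must be one of {sorted(USABLE_FRACTION)}")
    return vra_ram_gb * 1024 * USABLE_FRACTION[role]


if __name__ == "__main__":
    for ram_gb in (1, 2, 4):
        print(
            f"{ram_gb}GB VRA: "
            f"protecting ~{usable_buffer_mb(ram_gb, 'protecting'):.0f}MB, "
            f"recovery ~{usable_buffer_mb(ram_gb, 'recovery'):.0f}MB"
        )
```

This is only a back-of-the-envelope figure for comparing RAM settings; the real buffer behavior depends on the write rate of the protected VMs and the WAN link.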


Zerto: ZVM Installation

Here we go!  The following procedure is a step-by-step installation of Zerto Virtual Replication 4.5 U3.  Before starting, you should have built 2 Windows VMs per the Zerto system requirements.  If this is being done in production, be sure to size the servers as needed for the number of VMs you will be protecting.

The version being installed is 4.5 U3.

System Requirements

Note: Be aware of OS limitations when dealing with 32-bit vs 64-bit.  In a 32-bit Windows Server installation, the maximum amount of RAM you can give the system (that it can actually use) is 4GB.  If you’re using Windows Server 2008 R2 or Windows Server 2012, they’re only available in 64-bit, so you won’t need to worry about this limitation.  For more information on Windows memory limitations, please see this.

Now we’ve got that out of the way, here are the system requirements for the Zerto Virtual Manager as of version 4.5 U3:

For the ZVM at Each Site

  • VMware vCenter 4.0 U1 or later with at least 1 ESXi host
  • The account you log into the ZVM with and use to run the service will need to have administrative privileges in vCenter.
  • Supported Windows Operating Systems:
    • Windows Server 2003 SP2 or higher
    • Windows Server 2008
    • Windows Server 2008 R2
    • Windows Server 2012
    • Windows Server 2012 R2
  • Resource Reservations in vSphere
    • CPU: Reserve at least 2 vCPUs
    • Memory: Reserve at least 4GB
  • Resource Requirements for ZVMs:
    • Up to 750  protected VMs and up to 5 peer sites:
      • 2 vCPU, 4GB RAM
    • 751-2000 protected VMs and up to 15 peer sites:
      • 4 vCPU, 4GB RAM
    • > 2000 protected VMs and > 15 peer sites:
      • 8 vCPU, 8GB RAM
  • Time/NTP Requirements:
    • Zerto VMs must be synchronized with UTC (you can set actual timezones)
    • It is recommended to use an NTP server for clock synchronization.
  • Microsoft .NET Framework 4 (included with the Zerto installation package)
  • Storage: At least 2GB, plus 1.8GB if you need to install the .NET Framework
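
The sizing tiers above can be expressed as a small lookup.  This Python sketch uses only the figures from the list; the one assumption I've added is that when the VM count and the peer-site count land in different tiers, you size for the larger one.  The names `ZvmSize` and `zvm_size` are hypothetical.

```python
from typing import NamedTuple


class ZvmSize(NamedTuple):
    vcpus: int
    ram_gb: int


# (max protected VMs, max peer sites, size) for the two bounded tiers above.
TIERS = [
    (750, 5, ZvmSize(2, 4)),
    (2000, 15, ZvmSize(4, 4)),
]
LARGEST = ZvmSize(8, 8)  # more than 2000 protected VMs and more than 15 peer sites


def zvm_size(protected_vms: int, peer_sites: int) -> ZvmSize:
    """Pick the smallest tier that covers both the VM count and the peer-site count."""
    for max_vms, max_sites, size in TIERS:
        if protected_vms <= max_vms and peer_sites <= max_sites:
            return size
    return LARGEST


if __name__ == "__main__":
    print(zvm_size(500, 3))
    print(zvm_size(500, 10))
    print(zvm_size(2500, 20))
```

Remember that the vSphere reservations listed earlier (at least 2 vCPUs and 4GB) apply regardless of which tier you end up in.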

For the VRA on Each Host

One VRA should be installed per host in a participating cluster.  By doing this, you are accounting for any vMotion or DRS activity related to any protected VM in the cluster(s).  VRAs are deployed from within the Zerto UI; when this is done, DRS affinity rules are automatically created for the VRAs, along with any required reservations.

Important:  After deploying the VRAs, be sure to add the ZVM and VRAs to a folder in vCenter that can be excluded from any snapshots.  In other words, if you’re using VADP for backups, be sure to exclude this folder, or each VRA/ZVM, from within your backup software.  Failing to do so will cause corruption, and you will have to re-deploy the VRAs.  Furthermore, this will prevent any performance degradation that results from snapshot cleanup/consolidation jobs.

VRAs require the following resources:

  • 12.5GB of datastore space (per VRA)
  • At least 1GB RAM (reserved automatically through the deployment process)
  • ESX/ESXi 4.0 U1 or higher
  • Ports 22 and 443 open on each host during installation of the VRA (during VRA deployment, Zerto will also attempt to enable the SSH service on each host; if it fails, you will need to manually enable/disable it).
  • You’ll need to identify which datastore to install the VRA to.
  • IP address details for each VRA (static addressing is recommended)
    • IP Address (for each VRA)
    • Subnet Mask
    • Default Gateway

VRAs will automatically be named by Zerto during deployment, and the names clearly indicate which host they are running on.

Network Requirements

  • At least 5 Mb/s of dedicated bandwidth between sites is required for Zerto

ZVM Installation

Once you’ve built your Windows VMs to house the ZVM, the steps below will guide you through the installation.  This will need to be done in both sites, although, if you only have 1 site, you can still protect and recover within the same site.  Please note that if you are installing in 2 geographically separated sites, you may need to open some firewall ports before pairing sites and initiating replication.  For firewall requirements, see this document.

  1. Browse to the directory where you downloaded the installation files and run the installer (Zerto Virtual Replication VMware Installer).

    zerto_installation_files

  2. Click Next on the welcome screen.

    zerto_installation_1_2

  3. Accept the License Agreement, and click Next.

    zerto_installation_1_3

  4. Select the installation directory, and click Next.

    zerto_installation_1_4

  5. Select the installation type, and click Next.

    zerto_installation_1_5

  6. Select either “Local System Account” or “This Account” if you have a dedicated service account.  Whichever you choose, the account will require unrestricted access to the local resources on the ZVM.  After you’ve made your selection, click Next.

    zerto_installation_1_6

  7. In the Database Type dialog box, select your database type, and click Next.

    Notes:  It is recommended to use an external SQL server when a site has more than 40 hosts that have VMs that need to be protected, and the site has more than 400 VMs that need to be protected.  If you use Windows Authentication for the SQL server (external), then the credentials in step 6 will be used.

    zerto_installation_1_7

  8. Enter the name of the vCenter along with the admin credentials that will be used to connect, then click Next.

    zerto_installation_1_8

  9. Optional: If you have vCloud Director and want to protect it using Zerto Virtual Replication, enter the information necessary to connect to it, and click Next; otherwise, leave the “Enable vCD BC/DR” box unchecked, and click Next.

    zerto_installation_1_9

  10. Enter the Zerto Virtual Manager settings to identify this installation, and click Next.

    zerto_installation_1_10

  11. Enter the required information for ZVM communication, and click Next.  The ports listed below are defaults, and if you recover to a site managed by a Cloud Service Provider, be sure you do not change the default ports.

    zerto_installation_1_11

  12. As soon as you click Next on the screen in the previous step, the installer will auto-validate ZVM communication to ensure the ports to this ZVM are open, verify the vCenter credentials that you specified, and register the vCenter plug-in.  If the validation for each item results in “OK”, click Run; otherwise, resolve any errors, and click Recheck.

    zerto_installation_1_12

  13. If you clicked Run, Zerto Virtual Replication will begin installing and configuring its components.

    zerto_installation_1_13


Replace a Failed External PSC in Enhanced Linked Mode: Part 2

In Part 1 of “Replace a Failed External PSC in Enhanced Linked Mode”, I worked through repointing a Windows vCenter Server to another External PSC, in an effort to unregister, and rebuild a failed PSC.

In this guide, I will walk through:

  • Building a new Windows external PSC
  • Joining the SSO Domain
  • Re-pointing VC1 back to this newly-built, and linked external PSC, therefore, returning us to the original topology we started with (except for the server name)

The main systems I’ll be working with in this guide are:

  • VC1 (Windows vCenter that was re-pointed to the other site’s working PSC while we rebuild its home PSC)
  • PSC3 – This is a newly built Windows Server 2012 R2 PSC that is taking the place of PSC1.

I would have chosen to use the same name of the original PSC, but from experience, it’s always better to be safe, and not try to re-introduce a problematic record into the environment, just in case there are still entries hanging out somewhere in the working configuration.

At the end of this guide, we will end up with this topology:

replace_psc_part2_topology_1

 

Install the External PSC Role on the New Server


Note: 
These steps are the same ones to deploy the second PSC in “Deploying Windows vCenter with External PSCs in Enhanced Linked Mode: Part 2.”

  1. Launch the vCenter Server Installer
  2. Select vCenter Server for Windows, and click Install.
  3. Click Next on the welcome screen.
  4. Accept the EULA, click Next.
  5. Under External Deployment, select Platform Services Controller, and click Next.

    replace_psc_part2_1_5

  6. Verify the system name (This should be the same FQDN of the PSC you are building to replace the failed one), and click Next.
  7. Select Join a vCenter Single Sign-On domain.
  8. Enter the FQDN for the first Platform Services Controller that owns the SSO domain you want to join.
  9. Enter the vCenter Single Sign-On password, then click Next.

    replace_psc_part2_1_9

  10. When prompted for Certificate Validation, click OK to accept the self-signed certificate.
  11. Select Join an existing site, choose the site from the dropdown menu (should match the site name of the first PSC you created), and click Next.

    replace_psc_part2_1_11

  12. On the Configure Ports page, make any changes necessary for your environment, and click Next.
  13. Set the PSC installation and data directories, and click Next.
  14. Select whether or not to join the Customer Experience Improvement Program (CEIP), and click Next.
  15. Verify the installation summary settings, and if all looks well, click Install.
  16. Once the installation has completed, log into the vSphere Web Client, and navigate to Home > Administration > Deployment > System Configuration.  Under the Nodes object, verify that there are now 4 nodes (you should see 2 PSCs, and 2 vCenter servers).

 

Next Steps:

After verifying functionality of the newly added PSC, the next step is to re-point VC1 (repointed previously to PSC2) to the new PSC (PSC3).

replace_psc_part2_2_main

Repoint the Connections Between vCenter Server and Platform Services Controller:
 
  1. Log onto the vCenter Server instance (VC1).
  2. In the command prompt (run as administrator), navigate to C:\Program Files\VMware\vCenter Server\bin (or wherever you have vCenter installed to).
  3. Run the cmsso-util script:
    cmsso-util repoint --repoint-psc psc_fqdn_or_static_ip [--dc-port port_number]

    psc_fqdn_or_static_ip – is the FQDN or static IP address of the PSC you want to repoint to.

    replace_psc_part2_3_3

  4. Log into the vCenter Server instance by using the vSphere Web Client to verify that the vCenter server is running and can be managed.
  5. Finally, to see what PSC each vCenter is connected to:
    1. Log into the vSphere Web Client, and navigate to Hosts and Clusters View.
    2. Select a vCenter, and go to the Manage Tab.
    3. In Settings, go to Advanced Settings.
    4. Search for the config.vpxd.sso.admin.uri
    5. When the result is returned, look at the Value field, and this will tell you what PSC the particular vCenter is connected to.

    replace_psc_part2_3_5
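
If you have several vCenters to repoint, it can help to generate the exact command line ahead of time instead of typing it on each server.  The Python sketch below only assembles the `cmsso-util repoint` arguments shown in step 3 and prints them; it does not run anything.  The function name and the example FQDN are hypothetical.

```python
from typing import List, Optional


def build_repoint_command(psc: str, dc_port: Optional[int] = None) -> List[str]:
    """Assemble the cmsso-util repoint arguments from step 3 as a list."""
    if not psc:
        raise ValueError("a PSC FQDN or static IP address is required")
    command = ["cmsso-util", "repoint", "--repoint-psc", psc]
    if dc_port is not None:
        # --dc-port is the optional argument shown in brackets in step 3.
        command += ["--dc-port", str(dc_port)]
    return command


if __name__ == "__main__":
    # "psc3.example.local" is a placeholder for your new PSC.
    print(" ".join(build_repoint_command("psc3.example.local")))
```

Paste the printed line into the administrator command prompt from step 2, in the vCenter Server bin directory.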

 

This completes the series for Replacing a Failed External PSC in Enhanced Linked Mode. If you find that this has helped you, please feel free to share the information. It took me quite a while to gather all the information needed, and build the environment for this, so I really hope it helps.

In my research when first encountering the issues with my failed PSC, I found that there are a lot of other bloggers out there who have written something about the issues, troubleshooting steps, and fixes related to a failed PSC. While this is not a “one fix to rule them all” solution, it is a very clean way to replace a failed PSC. I apologize for not documenting the same thing for the VCSA, however, if you follow the steps in the order I provided, the links I have in my posts also have the proper steps to execute for the VCSA.

 


Replace a Failed External PSC in Enhanced Linked Mode: Part 1

If you’ve followed along starting with the 2-part series about “Deploying Windows vCenter with External PSCs in Enhanced Linked Mode…”, this is the next installment, in which we will go through replacing a failed external PSC using that same topology.

If you haven’t followed along, then the following links will help set the stage for what we’re doing here:

The reasons I’m performing these steps, or even trying it, are:
  • I have never performed this procedure before.
  • I would like to know how to do this and be able to help others out by sharing my experience and the information.
  • I would like to understand the requirements and general order of operations for performing the procedure.
  • I would like to address a current production issue without experimenting in production, and therefore causing further complications or a complete loss of manageability of my environment.

 

The Workflow

replace_psc_part1_topology

 

Lab Testing

 

So to prepare for this in the lab, I’ve created a datacenter in each vCenter, assigned some permissions to both of them (AD-integrated permissions), and licensed both vCenters (licenses to be removed following the testing).  This was done to ensure that replication was succeeding between the two PSCs.  I also ran the vdcrepadmin.exe utility on both PSCs to ensure replication is succeeding, and there are no outstanding changes or replication problems.

From PSC1:

.\vdcrepadmin.exe -f showpartnerstatus -h localhost -u administrator -w [password]

replace_psc_part1_lab_1_1

From PSC2:

.\vdcrepadmin.exe -f showpartnerstatus -h localhost -u administrator -w [password]

replace_psc_part1_lab_1_2

The next step is to shut down PSC1 to simulate a failure.  Once it is shut down, re-point the vCenter affected by this failure to the working PSC:

replace_psc_part1_lab_1_3

If I run the following on PSC2, it will show that PSC1 is now offline:

.\vdcrepadmin -f showpartnerstatus -h localhost -u administrator -w [password]

If I run the following on PSC2, it still shows that there are two PSCs registered:

.\vdcrepadmin -f showservers -h localhost -u administrator -w [password]

replace_psc_part1_lab_1_4
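
If you want to repeat these checks on each PSC without retyping them, they can be wrapped in a small script.  The following is a minimal Python sketch and not part of the procedure I ran: it only assembles the vdcrepadmin call from the flags shown above (-f, -h, -u, -w) and returns the raw output without trying to parse it.  The function names are my own, and it assumes Python 3.7 or later is available on the PSC.

```python
import subprocess


def vdcrepadmin_args(function, password, host="localhost",
                     user="administrator", exe=r".\vdcrepadmin.exe"):
    """Build the argument list for one vdcrepadmin call.

    Only the flags used in this post appear here: -f, -h, -u, -w.
    """
    return [exe, "-f", function, "-h", host, "-u", user, "-w", password]


def run_vdcrepadmin(function, password, **kwargs):
    """Run vdcrepadmin and return its raw text output (no parsing attempted)."""
    result = subprocess.run(vdcrepadmin_args(function, password, **kwargs),
                            capture_output=True, text=True, check=True)
    return result.stdout
```

Run from the directory that contains vdcrepadmin.exe, a call such as run_vdcrepadmin("showpartnerstatus", "[password]") returns the same text you would see in the console, and run_vdcrepadmin("showservers", "[password]") does the same for the server listing.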

Re-point the Connections Between vCenter Server and Platform Services Controller

  1. Log onto the vCenter Server instance (VC that is still connected to the failed PSC).
  2. In the command prompt (run as administrator), navigate to C:\Program Files\VMware\vCenter Server\bin (or wherever you have vCenter installed to).
  3. Run the cmsso-util script to repoint the connection of this vCenter to the PSC that is still alive:
    cmsso-util repoint --repoint-psc psc_fqdn_or_static_ip

    (Running the command above may take some time to complete. In my test lab, it took approximately 13 minutes to complete)

    replace_psc_part1_lab_3_3

    Part of the repointing task includes stopping and starting all vCenter related services on the server.  Give the web server additional time to fully initialize before moving on with the next step.

  4. Log into the vCenter Server instance by using the vSphere Web Client to verify that the vCenter server is running and can be managed.

After the web server completed its initialization for VC1, I was able to log in successfully and verify the inventory, permissions, and licensing.  The next step is to unregister the bad PSC (PSC1) from the configuration on PSC2.
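
Rather than refreshing the browser while the services restart after the repoint, the wait can be scripted.  This is a rough Python sketch under my own assumptions: the helper names are hypothetical, the URL you pass in is whatever address your vSphere Web Client normally answers on, and certificate verification is switched off only because a lab typically uses self-signed certificates.

```python
import ssl
import time
import urllib.error
import urllib.request


def web_client_responds(url):
    """Return True if an HTTPS GET to url answers with status 200."""
    # Lab assumption: self-signed certificate, so verification is disabled.
    ctx = ssl.create_default_context()
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE
    try:
        with urllib.request.urlopen(url, timeout=10, context=ctx) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        return False


def wait_until(probe, timeout_s=1800, interval_s=30,
               sleep=time.sleep, clock=time.monotonic):
    """Call probe() until it returns True or timeout_s elapses."""
    deadline = clock() + timeout_s
    while True:
        if probe():
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)
```

For example, wait_until(lambda: web_client_responds("https://vc1.lab.local/vsphere-client/")) polls every 30 seconds for up to 30 minutes; the host name and path in that call are placeholders for your own.  A 200 response only tells you the web server is answering, so you still need to log in and check the inventory as described in step 4.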

 

Unregister the Failed PSC

 

  1. On the PSC (live one that you just repointed to), open a command prompt (run as administrator).
  2. Browse to C:\ProgramData\VMware\vCenterServer\cfg\install-defaults.
  3. On the failed PSC: Open the vmdir.ldu-guid file to find the hostid.
  4. On the working PSC: Navigate to C:\Program Files\VMware\vCenter Server\bin
  5. On the working PSC: Run the cmsso-util unregister command to unregister the stopped/failed Platform Services Controller:

    cmsso-util unregister --hostId host_Id --node-pnid Platform_Services_Controller_System_Name --username administrator@vsphere.local --passwd [password]

    replace_psc_part1_lab_4_5

    After this has been run successfully, verify that the OLD PSC has been removed from the topology.

  6. In the vSphere Web Client, navigate to Home > Administration > Deployment > System Configuration.  Under the Nodes object, verify that there are only 3 nodes (you should see 1 PSC, and 2 vCenter servers).
  7. On PSC2, run

    .\vdcrepadmin -f showservers -h localhost -u administrator -w [password]

    replace_psc_part1_lab_4_7

    You should now only see one server in the listing, as opposed to 2, since you just removed the failed PSC.

  8. Delete the failed PSC (VM) you no longer need from the vSphere inventory.
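
The unregister command in step 5 has enough arguments that it is easy to mistype, so here is a small Python sketch that reads the host ID from a copy of the vmdir.ldu-guid file and assembles the command.  It uses only the options shown above (--hostId, --node-pnid, --username, --passwd).  The function names are hypothetical, and it assumes the file holds nothing but the ID as plain text, which you should confirm against your own file.

```python
from pathlib import Path


def read_host_id(ldu_guid_file):
    """Return the contents of a vmdir.ldu-guid file with whitespace stripped.

    Assumption: the file contains only the host ID as plain text.
    """
    return Path(ldu_guid_file).read_text().strip()


def unregister_args(host_id, node_pnid, password,
                    username="administrator@vsphere.local",
                    exe="cmsso-util"):
    """Build the cmsso-util unregister argument list from step 5."""
    return [exe, "unregister",
            "--hostId", host_id,
            "--node-pnid", node_pnid,
            "--username", username,
            "--passwd", password]
```

Printing " ".join(unregister_args(...)) gives you a line to review before you paste it into the administrator command prompt on the working PSC.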

 

There you have it.  We have successfully re-pointed a vCenter to another PSC, unregistered the bad PSC, and validated that we are now ready to rebuild in order to re-instate the original topology.  The next part of this series will cover building the replacement PSC, joining the SSO domain, and finally, repointing the vCenter at this new PSC, thereby returning the topology to where it was before we started.


Deploying Windows vCenter with External PSCs in Enhanced Linked Mode: Part 2

Before following any steps in here, be sure to refer to the previous part of this series, which provides some background information and walks through the steps to install the first PSC and vCenter Server.  This post is a continuation and marks the differences between the initial install (previous post) and the additional PSC and vCenter Server.
In this post, I will walk through joining the second PSC to the SSO domain created previously.  Following that, I will call out any steps in the second vCenter Server installation that differ from the first install and include screenshots.
For any pre-requisites, head to Part 1 of this series.

 

Install the Second Platform Services Controller on a Windows Machine and Joining the SSO Domain

If deploying in a production environment, refer to the vSphere Installation and Setup for vSphere 6.0 Guide.

    1. Launch the vCenter Server Installer
    2. Select vCenter Server for Windows, and click Install.
    3. Click Next on the welcome screen.
    4. Accept the EULA, click Next.
    5. Under External Deployment, select Platform Services Controller, and click Next.

        external_psc_part2_1_5

    6. Verify the system name (recommended you use FQDN, not IP Address), and click Next.
    7. Select Join a vCenter Single Sign-On domain.
    8. Enter the FQDN for the first Platform Services Controller that owns the SSO domain you want to join.
    9. Enter the vCenter Single Sign-On password, then click Next.

        external_psc_part2_1_9

    10. When prompted for Certificate Validation, click OK to accept the self-signed certificate.
    11. Select Join an existing site, choose the site from the drop-down menu (should match the site name of the first PSC you created), and click Next.

        external_psc_part2_1_11

    12. On the Configure Ports page, make any changes necessary for your environment, and click Next.
    13. Set the PSC installation and data directories, and click Next.
    14. Select whether or not to join the Customer Experience Improvement Program (CEIP), and click Next.
    15. Verify the installation summary settings, and if all looks well, click Install.

Next Steps

 
Once completed, run the installer on the vCenter Server that will connect to this PSC, and be sure to connect it to THIS PSC, NOT THE FIRST PSC that was built.

external_psc_part2_1_ns_1

Install vCenter Server and the vCenter Server Components, Connecting to the Second PSC

Perform these steps, making sure to connect this vCenter to the PSC that was just installed above, not the first PSC from Part 1.
Installation
 
  1. Launch the vCenter installer and select vCenter Server for Windows. Click Install.
  2. Once the installer initializes, click Next.
  3. Accept the VMware End User License Agreement, and click Next.
  4. Under External Deployment, select vCenter Server and click Next.
  5. Enter the system’s FQDN, and click Next.
  6. Enter the information for the external PSC that was deployed in the section above, and click Next.  This step will register the vCenter with the PSC.
  7. When prompted for certificate validation, click OK to approve the self-signed certificate created by the PSC.
  8. Configure the vCenter service account according to your environment requirements and click Next.  If you are using an external database server, you will need to specify a user service account.

    Note: 
    If you are using a user service account, you will need to make sure it has the “log on as a service” privilege in the local security policy.

  9. Select your database deployment and enter information if necessary, then click Next.
  10. Configure the required ports if necessary to match your environment, and click Next.
  11. Configure the installation directory for the vCenter Server and data, then click Next.
  12. Review all settings, and when ready, click Install.
Next Steps
 
After you’ve completed this setup, you should now have a functional topology, with 2 vCenters each connected to an external platform services controller.  The two external PSCs should now also be running in enhanced linked mode, and to verify, you can log into either vCenter, and see that you can also manage the inventory of the other.
I will follow this up with my testing for vCenter repointing and recovering from a failed PSC in this topology.

Deploying Windows vCenter with External PSCs in Enhanced Linked Mode: Part 1

In many cases for small environments, it makes sense to deploy a vCenter Appliance with an embedded Platform Services Controller.  In larger environments with multiple sites, although you can manage remote hosts with a single vCenter (which doesn’t mean you should), it may make more sense to deploy a vCenter in each site or region to further increase availability.  Furthermore, by configuring Enhanced Linked Mode, you can simplify management by linking vCenter systems and replicating roles, permissions, licenses, policies, and tags, instead of managing each vCenter server individually.

Enhanced Linked Mode is not limited to linking only Windows vCenters or only vCenter Server Appliances; you can also link Windows vCenters and vCenter Server Appliances together in the same environment.  That said, there are supported and unsupported topologies.  Covering them is beyond the scope of this post, but you can find the information for those topologies in this VMware KB Article.

The following procedure is my re-creation of our production topology in a test environment.  The steps walk through a standard installation and configuration of the vCenter servers and PSCs in Enhanced Linked Mode to get to a usable state.  The follow-up post will cover installing the second PSC and vCenter and linking them to the SSO domain and site created in this how-to.

Disclaimer:  The following information and procedures are performed in an isolated environment, and I am building these with the absolute minimum requirements since my test environment is a single physical host that also hosts other test machines in use by another engineer.  If performing these steps in production, be sure to follow proper sizing best practices, and build to suit your particular environment using the resource links provided throughout the article.

Hardware Requirements: vCenter Server for Windows

Before we get into the installation procedure, here’s an idea of what I’m building and the topology I will be using.

(All servers listed below are running Windows Server 2012 R2)

  • 1 Active Directory Domain Controller
  • 1 Single Sign-On Domain
  • 2 Sites (simulated)
  • 2 External Platform Services Controllers (one for each site)
  • 2 vCenter Servers (one for each site)

external_psc_diagram_1
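
It can help to write the planned topology down as data before installing anything, so that each vCenter is paired with the PSC in its own site.  The sketch below is only an illustration in Python: the site names and host names are placeholders I made up, and the check simply confirms that every site has one PSC and one vCenter and that no host name is reused.

```python
# Placeholder names; substitute your own sites and FQDNs.
TOPOLOGY = {
    "sso_domain": "vsphere.local",
    "sites": {
        "Site1": {"psc": "psc1.lab.local", "vcenter": "vc1.lab.local"},
        "Site2": {"psc": "psc2.lab.local", "vcenter": "vc2.lab.local"},
    },
}


def check_topology(topology):
    """Return a list of problems; an empty list means the plan is consistent."""
    problems = []
    seen = set()
    for site, nodes in topology["sites"].items():
        for role in ("psc", "vcenter"):
            host = nodes.get(role)
            if not host:
                problems.append(f"{site}: missing {role}")
            elif host in seen:
                problems.append(f"{site}: {host} is used more than once")
            else:
                seen.add(host)
    return problems
```

Calling check_topology(TOPOLOGY) on the plan above returns an empty list.  Pointing both vCenters at the same PSC by mistake would not be caught by this check, since each site entry names its own PSC; it only guards against missing or duplicated hosts in the plan.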

 

Install a Platform Services Controller on a Windows Host

 

If deploying in a production environment, refer to the vSphere Installation and Setup for vSphere 6.0 Guide.
Pre-requisites
  • Verify system meets the minimum HW/SW requirements
  • Download the vCenter Server Installer
  • Install Adobe Flash Player version 11.9 or later if you will be running the vSphere Web Client from one of the host machines.
  • Forward and Reverse DNS entries should be created for each system prior to installation.
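
The DNS pre-requisite is the one most likely to cause trouble later, so it is worth checking from each server before you start.  Here is a minimal Python sketch using the standard socket module; the function name is my own, and it only checks that the forward lookup resolves and that the reverse lookup of that address returns the same name.

```python
import socket


def dns_round_trip(fqdn, forward=socket.gethostbyname,
                   reverse=socket.gethostbyaddr):
    """Resolve fqdn to an IP, then resolve that IP back to a name.

    Returns (ip, reverse_name, ok); ok is True when the reverse name matches
    the FQDN, ignoring case and a trailing dot.
    """
    ip = forward(fqdn)
    reverse_name = reverse(ip)[0]
    ok = reverse_name.rstrip(".").lower() == fqdn.rstrip(".").lower()
    return ip, reverse_name, ok
```

Calling dns_round_trip("psc1.lab.local") with your own FQDN in place of that placeholder raises an exception if either record is missing, and returns ok as False if the reverse record points at a different name.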

 

Installation

  1. Launch the vCenter installer and select vCenter Server for Windows. Click Install.

    external_psc_part1_1_1

  2. Once the installer initializes, click Next.
  3. Accept the VMware End User License Agreement, and click Next.

    external_psc_part1_1_3

  4. Under External Deployment, select Platform Services Controller and click Next.

    external_psc_part1_1_4

  5. Enter the system’s FQDN, and click Next.

    external_psc_part1_1_5

  6. If this is the first PSC being created, create a new Single Sign-On Domain, set the password, provide the site name, and click Next.

    external_psc_part1_1_6

  7. Configure the ports as needed (I left this at the defaults, as I have no need to change them), then click Next.

    external_psc_part1_1_7

  8. Change the installation and data directories as needed and click Next.  For my implementation, I kept the defaults, since this is not production.

    external_psc_part1_1_8

  9. Select whether or not you want to join the customer experience improvement program (CEIP), then click Next. Since this is an isolated environment with no external internet access, I unchecked it.

    external_psc_part1_1_9

  10. At the Ready to Install screen, verify settings, and if everything looks good, click Install.  When done, the installer will provide you with next steps (convenient!)

    external_psc_part1_1_10

 

Next Steps
 
Note: You must wait for the PSC installation to complete before moving to the next step.  VMware does not support concurrent installations of PSC and vCenter.

 

Install vCenter and the vCenter Components

 

  1. Launch the vCenter installer and select vCenter Server for Windows, then click Install.

    external_psc_part1_2_1

  2. Once the installer initializes, click Next.
  3. Accept the VMware End User License Agreement, and click Next.

    external_psc_part1_2_3

  4. Under External Deployment, select vCenter Server and click Next.

    external_psc_part1_2_4

  5. Enter the system’s FQDN, and click Next.

    external_psc_part1_2_5

  6. Enter the information for the external PSC that was deployed in the section above, and click Next.  This step will register the vCenter with the PSC.

    external_psc_part1_2_6

  7. When prompted for certificate validation, click OK to approve the self-signed certificate created by the PSC.

    external_psc_part1_2_7

  8. Configure the vCenter service account according to your environment requirements and click Next.  If you are using an external database server, you will need to specify a user service account.

    Note: If you are using a user service account, you will need to make sure it has the “log on as a service” privilege in the local security policy.

    external_psc_part1_2_8

  9. Select your database deployment and enter information if necessary, then click Next.

    external_psc_part1_2_9

  10. Configure the required ports if necessary to match your environment, and click Next.

    external_psc_part1_2_10

  11. Configure the installation directory for the vCenter Server and data, then click Next.

    external_psc_part1_2_11

  12. Review all settings, and when ready, click Install.

    external_psc_part1_2_12

 

Next Steps
  • Configure vCenter to integrate with your Active Directory domain or LDAP server as needed.
  • Configure any group memberships, roles, permissions, licenses, etc…
  • Install the next PSC and be sure to join the existing Single Sign-On domain that you just created.  Following that, install the next vCenter, and you are done!