
Tips for VCP5-DCV Exam Preparation

I recently submitted a white paper on Tips for VCP5-DCV Preparation that contains a lot of details that I expect VCP candidates will find useful, such as:

  • Benefits and requirements of VCP5-DCV
  • Differences in the VCP510 and VCP550 exams
  • General preparation tips
  • URLs to unofficial practice exams
  • Link to Measure-Up, the official source for practice exams
  • Special considerations for the new VCP550 exam
  • Details on about 50 items (objectives, skills, knowledge) that  deserve special attention
  • Information on related, official VMware courses

If you are preparing for the VCP5-DCV exam, please take a look and provide feedback.

 

 

Proposal for Providing AD and DNS for SRM Test Failover

This is an idea that I have implemented for DR test purposes for a couple of customers.  I have not seen this idea documented anywhere in the community, so I thought I would post it here for discussion.

Scenario:  

For some of my VMware SRM customers, the network allows production VLANs and subnets to be extended to the recovery site, which allows us to keep the IP settings of VMs during a planned migration or disaster recovery. This certainly simplifies our DR planning. For example, DNS records do not have to be updated during a disaster recovery or planned migration, because the IP addresses of the VMs do not change. We can run a set of Active Directory (AD) controllers and Domain Name System (DNS) servers at the recovery site, where they stay synchronized with their counterparts that run at the protected site. This means that current AD and DNS data is available at the recovery site and that SRM does not have to recover AD and DNS during a real disaster recovery.

However, some challenges may exist while performing non-disruptive test recoveries in this scenario, which typically requires isolated test networks. The first concern is whether the IP settings of the VMs must be changed during non-disruptive tests, so that the original VMs and applications can continue to run undisturbed. Preferably, the test networks will allow us to keep the original IP settings of the VMs without being visible to the production network, where the IP addresses are currently in use. If this can be achieved, then the next issue is how to provide the required AD controllers and DNS servers for test purposes. Ideally, the AD controllers and DNS servers would provide current data (current at the moment the test began), would run in the test network with no concern that they can be seen by the production network, and would be easily removed after the test is complete.

Proposal:

To facilitate non-disruptive testing, we could use vSphere Replication (VR), a dedicated recovery plan, and the Test Recovery option in SRM to fail over AD, DNS, and other required infrastructure servers to the test network, prior to using Test Recovery on any other recovery plan. This ensures that the test network has access to recently updated AD, DNS, and other infrastructure servers, which are isolated from production networks during the test but are available to the VMs being tested. These infrastructure servers can harmlessly be modified in any manner during the test period. After the test is complete, the Cleanup operation in SRM can be used to clean up the VMs involved in both recovery plans.

NOTE-1:

The infrastructure (AD, DNS, DHCP, etc.) VMs should not be recovered when performing actual disaster recovery migrations. Instead, peers of these servers, which are kept consistent via the application (such as AD synchronization), should reside at the recovery site.

NOTE-2:

In some cases, the state of the AD controller could be inconsistent and produce errors when it is brought up at the recovery site during a Test Recovery operation. This could occur if the AD controller was in the midst of an AD synchronization when the VR replication occurred. In this rare case, just use the Cleanup operation and repeat the Test Recovery operation.

NOTE-3:

This proposal should only be implemented if the Test Network is completely isolated from the Production Network. It is acceptable that the test network be comprised of multiple networks (VLAN / subnets) that communicate with each other, as long as none of these networks can communicate with any production network.

NOTE-4:

In most cases, only a single AD controller should be included in the test recovery. VR does not quiesce multiple VMs simultaneously, even if they are part of the same protection group, so two or more domain controllers may not be in sync if recovered together.
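The ordering in the proposal can be sketched in a few lines of Python. This is only an illustration of the sequence (infrastructure plan first, application plans next, Cleanup on everything at the end). The `test`, `cleanup`, and `validate` callables and the plan names are hypothetical placeholders for whatever tooling or manual steps drive SRM in your environment, not an SRM API. Cleaning up in reverse order, so that the infrastructure plan is removed last, is my assumption and not an SRM requirement.

```python
from typing import Callable, List


def run_isolated_test(
    infra_plan: str,
    app_plans: List[str],
    test: Callable[[str], None],      # hypothetical: runs Test Recovery on a plan
    cleanup: Callable[[str], None],   # hypothetical: runs Cleanup on a plan
    validate: Callable[[], None],     # hypothetical: your application checks
) -> None:
    """Test the infrastructure plan first, then the other plans, then clean up."""
    touched: List[str] = []
    try:
        # AD/DNS must be running in the test network before anything else.
        for plan in [infra_plan] + list(app_plans):
            touched.append(plan)  # recorded first so a failed test is still cleaned up
            test(plan)
        validate()
    finally:
        # Assumption: remove the infrastructure plan last.
        for plan in reversed(touched):
            cleanup(plan)


# Usage example that only records the order of operations.
log: List[tuple] = []
run_isolated_test(
    "Infra-AD-DNS",
    ["App-Plan-1", "App-Plan-2"],
    test=lambda p: log.append(("test", p)),
    cleanup=lambda p: log.append(("cleanup", p)),
    validate=lambda: log.append(("validate", "")),
)
```

Per NOTE-4, the infrastructure plan in this sketch would normally contain a single AD controller.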

Sample Patching Policy – VMware Update Manager – ESXi Hosts

I typically recommend that administrators establish a policy for using VMware Update Manager to patch and update their ESXi hosts. Frequently, I help them write such a policy. The policy tends to vary greatly from one environment to the next. Sometimes, it varies from one ESXi cluster to the next within a single environment. The policy depends on many factors. Several of my customers are required to install new operating system patches (including ESXi patches) within 14 days of their release. Several of my larger customers have one or more clusters dedicated to development and test, where they are free to immediately install and test new patches without concern of impacting production services. Other customers only have three or four ESXi hosts, which are all running critical VMs. Some customers want to patch aggressively because they fear vulnerabilities. Some customers have little interest in patching and they ask, “If it works, why risk breaking it?” Some customers seldom patch, except immediately after installing an update or performing an upgrade.

Typically, my goal is to help the customer create a patch policy that suits them well and to help them develop the specific procedure for implementing the policy. Here is a sample of a policy and procedure that I recently helped develop for a customer. The customer uses two vSphere clusters to run an application whose SLA requires at least 99.99% availability. The application utilizes active and passive sets of virtual machines. The active set of VMs runs in Cluster A and the passive set of VMs runs in Cluster B. The administrators can instantly fail the application from Cluster A to Cluster B using a simple user interface provided by the application. They view vSphere simply as a solid, resilient platform to run this application. They make very few changes to the environment. They are very concerned that changing anything may disrupt the application or introduce new risk. Each cluster is composed of multiple blades and blade chassis.

In this particular use case, we developed the following policy and procedure:

  • Policy:  Plan to patch once per quarter and install only the missing Critical patches that are at least 30 days old. Initially, apply new patches to a single ESXi host in Cluster B. The next day, apply the new patches to a second host in the same chassis. On the third day, apply the new patches to the remaining hosts in that chassis. On the fourth day, apply the patches to the remaining hosts in Cluster B. On the following day, apply the new patches to all the hosts in one chassis in Cluster A. On the final day, apply the new patches to the remaining hosts in Cluster A.
  • Procedure:
    • Download all available patches from VMware’s website and manually copy the zip file to a location that is accessible from the vCenter Server.
    • Use the Import Patches link on the Update Manager configuration tab to import all patches from the zip file.
    • Create a new Dynamic baseline. Set the Severity to Critical, check On or Before, and set the Release Date to the specific date that is 30 days prior to the current date.
    • Attach the Baseline to Cluster B and Scan the entire cluster for compliance with the baseline.
    • Select one non-compliant ESXi host to patch first.  Select Enter Maintenance Mode on that host.
    • Edit the DRS Settings in the Cluster and change the Automation Level to Manual.
    • Remediate the host to install the missing patches.
    • Restart the host.  Examine its Events and logs and verify no issues exist.
    • Migrate a single, non-critical VM to the host.  Test various administration functions, such as console interaction, power on, and vMotion.
    • Select the cluster and the DRS tab. Use the Run DRS link to generate recommendations immediately. If any recommendations appear, use the Apply Recommendations button to start the migrations.
    • Following the order and schedule that is established in the policy, continue upgrading the remaining hosts in Cluster B.
    • After all hosts in Cluster B are patched, change the DRS Automation Level back to Fully Automated.
    • Update Cluster A by applying the previous steps.
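The date arithmetic behind the dynamic baseline and the day-by-day rollout order in the policy can be sketched in Python. This is a planning aid only; it does not talk to Update Manager. The `Patch` class, the patch IDs, and the chassis and host names are hypothetical, and the sketch assumes the first chassis listed for each cluster is the one that is patched first.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Dict, List


@dataclass(frozen=True)
class Patch:
    patch_id: str   # hypothetical identifier, for illustration only
    severity: str
    released: date


def baseline_cutoff(today: date, min_age_days: int = 30) -> date:
    """Release Date to enter with the On or Before option."""
    return today - timedelta(days=min_age_days)


def eligible_patches(patches: List[Patch], today: date) -> List[Patch]:
    """Critical patches released on or before the cutoff date."""
    cutoff = baseline_cutoff(today)
    return [p for p in patches if p.severity == "Critical" and p.released <= cutoff]


def rollout_waves(cluster_b: Dict[str, List[str]],
                  cluster_a: Dict[str, List[str]]) -> List[List[str]]:
    """One list of hosts per day, following the policy order.

    Each argument maps a chassis name to its hosts; the first chassis
    in each mapping is treated as the pilot chassis (an assumption).
    """
    b_chassis = list(cluster_b.values())
    a_chassis = list(cluster_a.values())
    pilot = b_chassis[0]
    return [
        pilot[:1],                                   # day 1: one host in Cluster B
        pilot[1:2],                                  # day 2: second host, same chassis
        pilot[2:],                                   # day 3: rest of that chassis
        [h for ch in b_chassis[1:] for h in ch],     # day 4: rest of Cluster B
        list(a_chassis[0]),                          # day 5: one chassis in Cluster A
        [h for ch in a_chassis[1:] for h in ch],     # day 6: rest of Cluster A
    ]


# Usage example with made-up data.
patches = [
    Patch("patch-a", "Critical", date(2014, 2, 15)),
    Patch("patch-b", "Critical", date(2014, 3, 20)),
    Patch("patch-c", "Important", date(2014, 1, 10)),
]
selected = eligible_patches(patches, today=date(2014, 3, 31))
waves = rollout_waves(
    {"B1": ["b1-1", "b1-2", "b1-3"], "B2": ["b2-1", "b2-2"]},
    {"A1": ["a1-1", "a1-2"], "A2": ["a2-1"]},
)
```

In this example only `patch-a` is selected, because `patch-b` is younger than 30 days and `patch-c` is not Critical.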

VMware SRM Custom Install – Shared Recovery Site

In a few of my customers’ VMware Site Recovery Manager (SRM) implementations, we needed to configure a single recovery site to support two protected sites. SRM does permit this, but it requires a custom installation. Early in my SRM engagements, I take steps to determine if a shared recovery site is needed or may be needed. In either case, I perform the custom installation that permits the shared recovery site. Here are a few keys to configuring a shared recovery site in SRM.

Planning

The first key is planning. The main difference in planning for a shared recovery site versus a standard recovery site is that SRM must be installed twice at the recovery site (once for each protected site). Each SRM instance must be installed on a separate Windows server at the shared recovery site. These two SRM instances represent a single site, but will have unique SRM-IDs. The SRM-ID can be thought of as the name of an SRM instance at the shared recovery site. A common convention is to set each SRM-ID value to a string that combines the name of the recovery site and the name of the site that the instance protects.

For example, consider a case where the shared recovery site is called Dallas, which protects two sites called Denver and Seattle.   At the Dallas site, SRM must be installed in two Windows servers.  One SRM instance will be used to protect the Denver site and the other instance will protect the Seattle site.  In this case, a sensible choice for the SRM-IDs may be DAL-DEN and DAL-SEA.
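As a small illustration of this naming convention, the following Python sketch builds one SRM-ID per protected site and checks that the results are unique. The function names are mine, and the three-letter site codes are simply the abbreviations used in the example above; SRM itself does not require this format.

```python
from typing import Dict, List


def srm_id(recovery_code: str, protected_code: str) -> str:
    """Combine the recovery site code and the protected site code."""
    return f"{recovery_code.upper()}-{protected_code.upper()}"


def plan_srm_ids(recovery_code: str, protected_codes: List[str]) -> Dict[str, str]:
    """Return one SRM-ID per protected site and refuse duplicates."""
    ids = {code: srm_id(recovery_code, code) for code in protected_codes}
    if len(set(ids.values())) != len(protected_codes):
        raise ValueError("each SRM instance at the shared site needs a unique SRM-ID")
    return ids


ids = plan_srm_ids("DAL", ["DEN", "SEA"])
print(ids)
```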

Custom Installation

The second key is to perform the custom installation.  To perform the custom installation:

  • Using one of the Windows servers at the shared recovery site, where an SRM instance will be implemented to protect one specific site, download the SRM installer.
  • In a command prompt, change the default directory to the location of the installer.
  • Run this command to launch the wizard for the custom installer: VMware-srm-5.1.1-1082082.exe /V"CUSTOM_SETUP=1"
  • The custom installation wizard should look much like the standard installation wizard, except that it includes some extra pages and options. The first additional page is the VMware SRM Plugin Identifier page. On this page, select Custom SRM Plugin Identifier.
  • The second additional page prompts the user to provide the SRM ID, Organization, and Description. The critical value to provide is the SRM ID, which should be set to the value that is planned for one of the SRM instances at the shared recovery site (for example, DAL-DEN).
  • The remainder of the installation process is identical to a standard installation.  Be sure to repeat these steps for the second SRM instance at the shared recovery site.

Connecting the Protected Site to the Correct SRM Instance

The third key is to connect each protected site to the correct SRM instance at the shared recovery site. The main difference between connecting a protected site to a shared recovery site and connecting it to a standard recovery site is the need to select the appropriate SRM-ID. Begin by performing the typical steps to connect the protected site to the recovery site, which requires using the SRM plugin for the vSphere Client to select the first protected site and click Configure Connection. In the Connection wizard, an extra page will appear after selecting the vCenter Server at the recovery site. The extra page identifies the two SRM instances at the recovery site by displaying a list providing the SRM-ID, Organization, and Description of each SRM instance. Choose the SRM-ID that corresponds to the SRM instance that should be used to protect the first site. Naturally, this process should be repeated for the second protected site.

New Exam for VCP5-Cloud Certification – VCPC550

VMware just released a new exam that can be used to qualify for VCP5-Cloud.  It is the VCPC550 exam.   It covers vCloud Director 5.5 and vCloud Automation Center (vCAC) 5.2.  Previously, VCP5-Cloud candidates had to pass the VCPC510 exam, which covers vCloud Director 5.1.  Now, candidates for VCP5-Cloud have a choice.  They can  pass either the new VCPC550 exam or the original VCPC510 exam.  In either case, the candidate will earn the same certification, VCP5-Cloud.

For details on the certification, such as exam choices and blueprint, see the VCP-Cloud Certification webpage.

Welcome to vLoreBlog 2.0!

After eighteen months of dedicating this blog to students to provide supplemental data and certification preparation advice, I decided to expand its goals.  This year, I plan to include categories of articles related to professional services focused on VMware.

In addition to having years of experience teaching official VMware training classes, I also have years of experience in delivering professional services on vSphere, View, vCloud and other VMware technologies. In addition to being a VMware Certified Instructor (VCI) Level 2, I also am a VMware Certified Professional (VCP) on datacenter virtualization (VCP-DCV), desktop (VCP-DT) and cloud (VCP-Cloud). I am also a VMware Certified Advanced Professional (VCAP) on datacenter design (VCAP-DCD), desktop design (VCAP-DTD), cloud design (VCAP-CID), and datacenter administration (VCAP-DCA). In addition to providing training via the VMware Authorized Training Center (VATC) program, I also provide professional services via the VMware Authorized Consultant (VAC) program. VMware utilizes me, as well as other authorized consulting partners, to deliver professional services in the field to their customers. VMware works hard to ensure that any subcontractor delivering their professional services is as capable as their own engineers. In my case, I have worked longer at delivering VMware-focused professional services than most of the engineers who are directly employed by VMware. I have trained several of these engineers.

I plan to begin using vLoreBlog to share advice and experience related to my professional services deliveries.  This will include areas such as:

  • New features, products, and announcements from VMware
  • 3rd party products and technologies
  • Architecture and design advice
  • Real field examples including challenges, decision justification, and gotchas

The articles will concentrate on details from my actual experience with the intent of providing details that may be lacking in the community.

Today, I posted my first article under the new category:  Professional Services Tips.

I hope you find it useful.

Custom vCenter Server Alarms and Actions

As part of many of my vSphere-related professional services engagements, such as jumpstarts, designs, upgrades and health-checks, I typically address the alarms provided by VMware vCenter Server. Frequently, I recommend creating some custom alarms and configuring specific actions on some alarms to meet customer needs. Although my recommendations are unique for each customer, they tend to have many similarities. Here I am providing a sample of the recommendations that I provided to a customer in Los Angeles, whose major focus is to ensure high availability. In this scenario, the customer does not use an SNMP management system, so we decided to use the option to send emails to the administration team, instead of sending SNMP traps. Also, in this scenario, the customer planned to configure Storage DRS in Manual mode, instead of Automatic mode.

vCenter Alarms and Email Notifications

Configure the Actions for the following pre-defined alarms to send email notifications.  I consider each of these alarms to be unexpected and worthy of immediate attention if they trigger in this specific vSphere environment.  Unless otherwise stated, configure the Action to occur only when the alarm changes to the Red state.

  • Host connection and power state (alerts if host connection state = “not responding” and host power state is NOT = Standby)
  • Host battery status
  • Host error
  • Host hardware fan status
  • Host hardware power status (HW Health tab indicates UpperCriticalThreshold = 675 Watts, UpperThresholdFatal=702 Watts)
  • Host hardware system board status
  • Host hardware temperature status
  • Host hardware voltage
  • Status of other host hardware object
  • vSphere HA host status
  • Cannot find vSphere HA master agent
  • vSphere HA failover in progress
  • vSphere HA virtual machine failover failed
  • Insufficient vSphere HA failover resources
  • Storage DRS Recommendation (if the decision is made to configure Storage DRS in a Manual Mode)
  • Datastore cluster is out of space
  • Datastore usage on disk (Red state is triggered at 85% usage)
  • Cannot connect to storage  (triggered if host loses connectivity to a storage device)
  • Network uplink redundancy degraded
  • Network uplink redundancy lost
  • Health status monitoring (triggers if changes occur to overall vCenter Service status)
  • Virtual Machine Consolidation Needed status (triggered if a Delete Snapshot task failed for a VM)

Consider creating these custom alarms on the folders where critical VMs reside.  Optionally, define email actions on some of these.

  • Datastore Disk Provisioned (%) (set the yellow trigger to 100%, the point where the provisioned disk space meets or exceeds the capacity)
  • VM Snapshot size (set to trigger at 2 GB)
  • VM Max Total Disk Latency  (set trigger at 20 ms for 1 minute)
  • VM CPU Ready Time – assign these to individual VMs or folders, depending on the number of vCPUs (total virtual cores) assigned to each VM
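Two of these triggers involve a little arithmetic, which the Python sketch below illustrates. It assumes the real-time performance interval of 20 seconds, converts a CPU Ready summation value (milliseconds) to a percentage, and scales a per-vCPU target to a VM-level threshold, on the assumption that the VM-level value is the sum across its vCPUs. That scaling is why the CPU Ready alarm is assigned according to vCPU count. The function names and the 5% target in the usage example are hypothetical, not values from the customer engagement.

```python
import math
from typing import Sequence

REALTIME_INTERVAL_S = 20  # assumption: real-time sample interval in seconds


def ready_percent(ready_ms: float, interval_s: int = REALTIME_INTERVAL_S) -> float:
    """Convert a CPU Ready summation (ms) over one interval to a percentage."""
    return ready_ms * 100 / (interval_s * 1000)


def ready_ms_threshold(target_percent_per_vcpu: float, vcpus: int,
                       interval_s: int = REALTIME_INTERVAL_S) -> float:
    """VM-level summation threshold (ms) for a per-vCPU percentage target."""
    return target_percent_per_vcpu * interval_s * 1000 * vcpus / 100


def sustained_breach(samples_ms: Sequence[float], threshold_ms: float = 20.0,
                     duration_s: int = 60,
                     interval_s: int = REALTIME_INTERVAL_S) -> bool:
    """True if latency stays above the threshold for the whole duration."""
    needed = math.ceil(duration_s / interval_s)  # consecutive samples required
    run = 0
    for value in samples_ms:
        run = run + 1 if value > threshold_ms else 0
        if run >= needed:
            return True
    return False


# Usage example with made-up numbers.
pct = ready_percent(1000)                    # 1000 ms of ready time in 20 s
four_vcpu_limit = ready_ms_threshold(5, 4)   # hypothetical 5% per-vCPU target
latency_alarm = sustained_breach([25, 30, 10, 22, 21, 25])
```

With these inputs, a 4-vCPU VM would need a trigger value four times larger than a 1-vCPU VM to represent the same per-vCPU ready time.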