1. Implementing VMware vSphere App High Availability (HA). As part of vSphere 5.5 HA, VMware has added vSphere App HA. vSphere App HA is not yet familiar to most VMware vSphere HA users, but it soon will be. Previous vSphere HA versions did not provide native application monitoring. It could be done only with third-party software or through APIs; nothing was available out of the box from VMware. VMware's Hyperic acquisition led to vSphere App HA, which monitors mission-critical application services and restarts them automatically if they become unstable or stop. It requires running a minimum of two virtual appliances on two separate host nodes in the cluster. But the benefits are huge for keeping mission-critical applications (such as Microsoft SQL Server, Microsoft SharePoint, Microsoft IIS, Apache HTTP Server or VMware vFabric tc Server) up and running at all times, even when the hardware has not failed.
2. Hardware component quality. Just because capabilities such as VSAN allow off-the-shelf components, don't assume all of those components are made the same. This is especially true of storage drives, both hard disk (HDD) and flash. Running with desktop-class HDDs or low-quality flash (low-end MLC or even TLC) is a recipe for frustrating problems and headaches. Just because cheap components can be used does not mean they should be. Any upfront savings from cheap hardware components will be more than wiped out by the sweat equity of troubleshooting. Be smart and use validated, certified and supported hardware components from trusted vendors that won't make a hash of things.
3. Read the VMware HA administrator's guide and best practices before implementing. Become familiar with those best practices and troubleshooting tips. Remember, these documents were compiled by VMware professionals who have seen and fixed the vast majority of user HA problems, so they cover the most common day-to-day user errors.
4. Clustered host hardware should be essentially identical. Be sure VMware vCenter is set up for HA; switches are set up for multi-pathing, trunking and PortFast; and all clustered hosts have identical VSAN flash caching. The flash used must deliver the capacity, IOPS, throughput and latency required to meet the requirements of the virtual machines (VMs) and their applications. Also, be sure there are more than enough resources available to cover one or more failed hosts in the cluster without unduly reducing the performance of mission-critical applications.
5. Test frequently. Set up a representative test plan. Test at least once a quarter. Before each test, make sure the VMware vSphere environment has not changed. If it has changed, update the test to reflect the real environment. Evaluate the results of each test; fix problems, issues and holes in the plan; and update the test plan.
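The headroom check in tip 4 can be sketched as simple arithmetic. The Python below is a minimal illustration, not a VMware tool or API: the `Host` class, the `survives_failures` function and all capacity figures are hypothetical, and a real sizing exercise would also account for hypervisor overhead, reservations and the cluster's admission control settings. It assumes the worst case, in which the largest hosts are the ones that fail.

```python
from dataclasses import dataclass


@dataclass
class Host:
    """Hypothetical description of one clustered host's usable capacity."""
    name: str
    cpu_mhz: int
    mem_gb: int


def survives_failures(hosts, demand_cpu_mhz, demand_mem_gb, failures=1):
    """Return True if the demand still fits after `failures` hosts are lost.

    Worst case is assumed: for each resource, the hosts contributing
    the most of that resource are the ones removed.
    """
    if failures >= len(hosts):
        return False  # no hosts would be left to run anything
    keep = len(hosts) - failures
    # Sorting ascending and keeping the smallest `keep` hosts drops the largest.
    cpu_left = sum(sorted(h.cpu_mhz for h in hosts)[:keep])
    mem_left = sum(sorted(h.mem_gb for h in hosts)[:keep])
    return cpu_left >= demand_cpu_mhz and mem_left >= demand_mem_gb


# Illustrative figures only: four identical hosts, as tip 4 recommends.
cluster = [Host(f"esx{i}", cpu_mhz=20000, mem_gb=256) for i in range(1, 5)]

one_down = survives_failures(cluster, 50000, 600, failures=1)
two_down = survives_failures(cluster, 50000, 600, failures=2)
print(one_down, two_down)
```

With these made-up numbers, the cluster covers the workload with one host down but not with two, which is the kind of result worth knowing before deciding how many host failures the cluster is expected to tolerate.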