VMware ESX has a feature called HA, which restarts downed virtual machines after a VMware physical host has gone down due to a hardware failure.
In the industry, at least in the UNIX world, HA means an assurance of a certain degree of operational continuity during a given measurement period. Regrettably, in VMware, if HA is used, it means you have already experienced downtime!
But what if we could somehow use VMware’s VMotion to move the running virtual machines to a different physical host long before the hardware actually failed?
What if you could find a piece of software that extended the use of VMware, that was free, and that came with direct manufacturer support 24 hours a day, 365 days a year?
What if we could spec that the software must have these features:
- A Single console to manage both physical and virtual systems
- Administer multiple virtualization technologies from a single console
- Easy to install and use
- Proactive virtual machine migration using VMotion based on predictive failure alerting from power supplies, CPUs, memory, disk drives, voltage regulation modules, PCI slots, fans, etc.
- Drive VMware VMotion using physical hardware status information
- In the event of an actual failure, the server must near instantaneously notify the manufacturer of the failure so repair parts and assistance can be dispatched
- A built-in topology view to help with understanding of resource linkages
- Integration with an enterprise maintenance console to increase serviceability by migration of virtual machines to a standby server during a service window
By using the IBM Director Extension Virtualization Manager you can have all of this.
Imagine this scenario:
You have 5 VMware physical hosts, each supporting 30 virtual servers. Your hardware is now 2.5 years old, as you began deploying your VM infrastructure back in 2006. You are running IBM Director and Virtual Center Server. IBM Director is monitoring the IBM BMC or RSA components of your 5 physical hosts. You have also deployed the free Service and Support Manager plugin of IBM Director. On one of the hosts, the voltage regulation module (VRM) for CPU01 begins accumulating error counts at an increasing rate, as monitored by the hardware’s BMC. At 5:00 am on Wednesday, the hardware’s BMC determines that a failure is imminent and flags the VRM with a predictive failure alert (PFA). The predictive failure alert triggers 2 action plans:
- The IBM Service and Support Manager plugin to IBM Director receives the PFA and sends a notification to IBM hardware support in Atlanta, GA. The engineering staff there receives the alert, validates that the server is under maintenance, and calls your corporate help desk to validate the ship-to address and to schedule an IBM CE on site if needed. Your help desk validates the PFA on the server and sends you an email letting you know that a new IBM component will be on site after you finish your first coffee of the day.
- The IBM Virtualization Manager extension to IBM Director receives the PFA simultaneously and notifies VMware that the physical host is going into maintenance mode. Using predefined actions, all running virtual machines residing on the physical host with the failing VRM are live migrated (VMotioned) off to one of the 4 remaining physical VMware hosts, thereby assuring zero downtime for your virtual server farm. After the failed component is replaced and the server is taken out of maintenance mode, all domiciled virtual machines are live migrated (VMotioned) back to the repaired server.
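To make the two action plans above a little more concrete, here is a toy Python sketch of the flow. It does not call IBM Director, Virtualization Manager, or VMware at all; the names `Host`, `notify_support`, `evacuate`, and `on_predictive_failure_alert` are made up for illustration, and the "pick the least-loaded healthy host" placement rule is my assumption, not something the products are documented here to do.

```python
from dataclasses import dataclass, field


@dataclass
class Host:
    """Hypothetical stand-in for one physical VMware host."""
    name: str
    vms: list = field(default_factory=list)
    in_maintenance: bool = False


def notify_support(host, component):
    """Action plan 1: stand-in for the call-home notification to the manufacturer."""
    return f"PFA on {host.name}: {component} flagged, service request opened"


def evacuate(host, cluster):
    """Action plan 2: enter maintenance mode and move every VM to a healthy host."""
    host.in_maintenance = True
    targets = [h for h in cluster if h is not host and not h.in_maintenance]
    if not targets:
        raise RuntimeError("no healthy host available to receive virtual machines")
    moved = {}
    while host.vms:
        vm = host.vms.pop()
        # Assumed placement rule: send each VM to the least-loaded healthy host.
        target = min(targets, key=lambda h: len(h.vms))
        target.vms.append(vm)
        moved[vm] = target.name
    return moved


def on_predictive_failure_alert(host, component, cluster):
    """One alert fans out to both action plans."""
    return notify_support(host, component), evacuate(host, cluster)


# The scenario from the text: 5 hosts with 30 virtual servers each.
cluster = [Host(f"esx{i}", [f"vm{i}-{n}" for n in range(30)]) for i in range(1, 6)]
ticket, moved = on_predictive_failure_alert(cluster[0], "VRM CPU01", cluster)
print(ticket)
print(sorted(len(h.vms) for h in cluster))  # [0, 37, 37, 38, 38]
```

The point of the sketch is the capacity math: with one host drained, the remaining 4 hosts each end up carrying 37 or 38 virtual servers instead of 30, so the farm needs that much headroom for the migration to succeed.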
The alternative to these automated action steps includes downed physical servers, downed virtual servers, calls from internal and external customers, missed coffee, middle-of-the-night wake-up calls, missed SLAs, frustration, anger, etc.
AND……….
These extensions are FREE!
