This article comes out of a discussion I had around a month ago, first with one of my colleagues and then with a customer, about the Capacity Management feature of vCenter Operations Manager (vCOPS).
I believe this topic is worth writing about, as it might help you understand how Capacity IQ, which is now rolled up into vCenter Operations Manager (vCOPS), calculates resource usage for a Virtual Machine or, for that matter, any object in vCenter. The resource usage ultimately helps the tool monitor capacity utilization over a period of time. This capacity utilization feeds the calculation of 2 minor badges:
a) Reclaimable Waste, and
b) Density
These two values then roll up into a main badge known as "EFFICIENCY". A score of 100 on efficiency means that you are using the virtual infrastructure in the most appropriate way. As that score drops, you know that you have virtual machines which are either oversized, which wastes resources, or undersized, which causes performance issues due to resource contention.
The screenshot below shows how the efficiency badge and its sub-badges appear in the vCenter Operations Dashboard.
At the end of the day, efficiency is the most important piece of information which the Capacity Management feature of vCOPS provides. From the perspective of an IT buyer, it becomes a tool which helps you to ensure that you do not waste any resources in your infrastructure by following primitive methods of resource allocation to servers and applications.
Hence, this allows you to right-size your infrastructure as you operate and manage it.
For example, a new application which needs to be deployed in your infrastructure needs a Windows 2008 R2 VM with 4 vCPU and 16 GB of RAM, as per the application owner. This might be a practice which the application owner is carrying forward from the world of physical servers. However, in a virtual environment it is quite possible that the VM will never use the allocated capacity. The challenge is how to capture this data and present it back to the application owner.
vCOPS has the answer. Once this machine is created and the server goes into production, vCOPS starts monitoring the virtual machine on a regular basis and captures data on its CPU and memory utilization. After a period of 30 to 45 days, vCOPS has enough data to understand the capacity utilization patterns of this virtual machine. At that point, the Reclaimable Waste report in vCOPS will easily tell you which virtual machines are oversized on CPU or memory. On the basis of this report you can reclaim the resources and save a lot of money for your organization by increasing the efficiency of the hardware.
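To make the idea of reclaimable waste concrete, here is a minimal Python sketch using the 4 vCPU / 16 GB VM from the example above. It is not how vCOPS actually computes Reclaimable Waste. The function names, the 20% headroom and the demand figures are all assumptions made up for illustration.

```python
import math

def recommended_size(allocated, demand_samples, headroom=0.2):
    """Suggest a size that covers the peak observed demand plus some headroom.

    The 20% headroom is an illustrative assumption, not a vCOPS default.
    The suggestion is never larger than what is already allocated.
    """
    peak = max(demand_samples)
    return min(allocated, math.ceil(peak * (1 + headroom)))

def reclaimable(allocated, demand_samples, headroom=0.2):
    """Capacity that could be taken back without going below the recommendation."""
    return allocated - recommended_size(allocated, demand_samples, headroom)

# Made-up demand observed over the monitoring period (not real measurements).
cpu_demand_vcpu = [0.8, 1.1, 1.4, 0.9]   # vCPUs actually in use
mem_demand_gb = [5.0, 6.2, 5.5, 6.0]     # GB of memory actually in use

cpu_reclaimable = reclaimable(4, cpu_demand_vcpu)
mem_reclaimable = reclaimable(16, mem_demand_gb)

print(f"Reclaimable vCPU: {cpu_reclaimable}")       # prints "Reclaimable vCPU: 2"
print(f"Reclaimable memory: {mem_reclaimable} GB")  # prints "Reclaimable memory: 8 GB"
```

With these made-up numbers, half of the CPU and half of the memory requested by the application owner would show up as candidates for reclaiming.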
Having said this, it is important that you have the correct settings to monitor the capacity utilization and usage patterns of your virtual infrastructure. In a business environment where the servers work from 9 AM to 6 PM, Monday to Friday, it is important that you capture the utilization patterns during this period to calculate the reclaimable waste and density of the Virtual Machine. In such a scenario, if you set the monitoring days and time to 24/7, you will end up capturing a lot of skewed data from idle nights and weekends which does not reflect the actual business cycle. This will ultimately result in a very low efficiency score and a huge amount of reported reclaimable waste, neither of which reflects reality.
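The effect of the monitoring window is easy to see with a small Python sketch. It uses synthetic data, not vCOPS output: I assume a VM that runs at 70% CPU during business hours and 5% otherwise, and the helper names are hypothetical.

```python
from datetime import datetime, timedelta

def in_business_hours(ts, start_hour=9, end_hour=18, workdays=range(0, 5)):
    """True if ts falls Monday to Friday, from 9 AM up to (not including) 6 PM."""
    return ts.weekday() in workdays and start_hour <= ts.hour < end_hour

def mean_utilization(samples, only_business_hours=False):
    """Average utilization in percent, optionally over business-hours samples only."""
    values = [pct for ts, pct in samples
              if not only_business_hours or in_business_hours(ts)]
    return sum(values) / len(values) if values else 0.0

# One synthetic week of hourly CPU samples, starting on a Monday.
start = datetime(2024, 1, 1)
samples = []
for hour in range(7 * 24):
    ts = start + timedelta(hours=hour)
    samples.append((ts, 70.0 if in_business_hours(ts) else 5.0))

avg_all = mean_utilization(samples)
avg_business = mean_utilization(samples, only_business_hours=True)

print(f"24/7 average: {avg_all:.1f}%")                  # prints "24/7 average: 22.4%"
print(f"Business-hours average: {avg_business:.1f}%")   # prints "Business-hours average: 70.0%"
```

Measured 24/7, this VM looks like it is barely used, so it would appear heavily oversized. Measured over the hours when the business actually runs, it is a well-utilized machine.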
To avoid such problems, follow the settings shown in the screenshot below and you should be good to go.
This ensures that you capture the right data and process it into valid information which will help you manage capacity in your Virtual Infrastructure. It's important that decisions regarding capacity are not taken in haste. Rather, you should customize the capacity monitoring settings for your own environment and then let the tool run for a period of 4 to 6 weeks before you start looking into the results and making changes for the betterment of your Virtual Infrastructure.
Hope this will help you learn more about what vCOPS can do for you and how you can do such tasks accurately.