Capacity Conundrum Part 1

 

– The vTrooper Report –

 

In an effort to gel up an internal billing and allocation model (GaaS – Gouging as a Service), I’ve been struggling with the concept of cost per VM. I asked a simple question about that idea in the Twittersphere; it turned into a discussion and, well… got out of hand. I apologize for that, as this is a better format to explain. (Special thanks to @asweemer for a dumping ground.)

If I had a Nickel for each VM…

At VMworld 2009 a keynote presentation showed the price of a VM hosted with Terremark: $0.05 per hour. I thought, wow. A nickel per hour. If I had a nickel per VM per hour, how much would I have available to spend on coffee?

Then I thought, wait. I have VMs. How much do they really cost me per hour? Well, the answer is… it depends. Old servers with high power consumption and low density and new blade systems packed with Intel 5500s have different burn rates, and those rates show up in different places (power, cooling, depreciation). I haven’t found a model that breaks those units down to my satisfaction yet. I need another way.

As a general practice I keep a few maxims in mind when I create a VM:

  1. S – 1 vCPU, 1 GB RAM, 1 GB Net, 10 GB Disk
  2. M – 2 vCPU, 2 GB RAM, 1 GB Net, 20 GB Disk
  3. L – 4 vCPU, 4 GB RAM, 2 GB Net, 40 GB Disk

Seems simple enough, but it doesn’t really generate a cost model on a consistent basis. Hardware continues to change, and each VM consumes resources at different rates and at different times of the day. A VM that isn’t doing anything isn’t really ‘consuming’ anything, right? I thought I would try to break it down further by creating a four-quadrant block with two macro categories: Compute (CPU and Memory) and I/O (Network and Disk).

Each resource area can increase or decrease for a reason without changing the size of the maxim the VM was created under. This allows for small variations in size without a customer yelling that their bill went up by $2 this month.

The Measurable Unit

Use a unit of measure to identify the four quadrants: vCPU : vMEM : vNET : vDISK, or C:M:N:D. Then overlay the VM creation maxims to count up the units. This way the growth of a ‘VM’ during its lifecycle can be allocated back to the proper IT metric. Using the VM creation maxims above, this may be:

  1. S – 1:1:1:1
  2. M – 2:2:1:2
  3. L – 4:4:2:4
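To make the unit counting concrete, here is a minimal sketch. It assumes one unit equals 1 vCPU, 1 GB of RAM, 1 GB of network (as the maxims state it), and 10 GB of disk, which is what the S size implies. The names `CMND`, `UNIT_SIZE`, and `to_cmnd` are made up for illustration, and rounding a partly used unit up to a whole one is my own assumption.

```python
from typing import NamedTuple


class CMND(NamedTuple):
    """Unit counts for the four quadrants: vCPU, vMEM, vNET, vDISK."""
    c: int
    m: int
    n: int
    d: int

    def __str__(self) -> str:
        return f"{self.c}:{self.m}:{self.n}:{self.d}"


# Assumed size of one unit per quadrant, inferred from the S maxim:
# 1 vCPU, 1 GB RAM, 1 GB net, 10 GB disk.
UNIT_SIZE = CMND(c=1, m=1, n=1, d=10)


def to_cmnd(vcpu: int, ram_gb: int, net_gb: int, disk_gb: int) -> CMND:
    """Convert a raw VM allocation into whole C:M:N:D units, rounding up."""
    raw = (vcpu, ram_gb, net_gb, disk_gb)
    # -(-a // b) is integer ceiling division: a partly used unit counts as one.
    return CMND(*(-(-value // size) for value, size in zip(raw, UNIT_SIZE)))


print(to_cmnd(1, 1, 1, 10))  # the S maxim
print(to_cmnd(2, 2, 1, 20))  # the M maxim
print(to_cmnd(4, 4, 2, 40))  # the L maxim
print(to_cmnd(2, 2, 1, 35))  # an M whose disk grew to 35 GB
```

The last call is the interesting one: the VM is still an "M" by its maxim, but its disk quadrant now counts 4 units, so the growth lands on the disk metric and nowhere else.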

This isn’t perfect, but it at least allows the average CPU cost to be allocated separately from memory, network, and disk costs. After all, you usually don’t get to upgrade all four parts of the quadrant in the same fiscal year. It also gives you a way to trend your average cost rate per unit over months and years to see which cost areas are improving. That is an interesting metric for both the business and IT, even if no one internally ever has to pay the values back (showback). Win-win in my book. It also helps police which VM is consuming too much of a specific resource, something that would skew the numbers if you simply took the cost of the ESX hosts and divided by the number of VMs.
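A rough sketch of what per-unit showback could look like next to the flat "host cost divided by VM count" split. The monthly rates in `RATE_PER_UNIT` and the VM names are invented purely for illustration; real rates would have to come from your own spend.

```python
# Hypothetical monthly cost per unit in each quadrant (made-up numbers).
RATE_PER_UNIT = {"C": 12.0, "M": 8.0, "N": 3.0, "D": 2.0}


def monthly_showback(cmnd: str) -> float:
    """Price a VM from its 'C:M:N:D' unit string."""
    units = [int(u) for u in cmnd.split(":")]
    return sum(count * RATE_PER_UNIT[q] for q, count in zip("CMND", units))


# Hypothetical VMs, one of each size.
vms = {"web01": "1:1:1:1", "app01": "2:2:1:2", "db01": "4:4:2:4"}

per_vm = {name: monthly_showback(units) for name, units in vms.items()}
flat_share = sum(per_vm.values()) / len(vms)  # the "divide by VM count" model

for name, cost in per_vm.items():
    print(f"{name}: unit-based ${cost:.2f} vs flat ${flat_share:.2f}")
```

With these made-up rates the small VM is charged less than half of the flat share and the large one well above it, which is the skew the flat model hides.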

Apples, Oranges, Lemons, and Grapes = Frutti Results

So you have a unit of measure and a type of system to match the measurement up against over a period of time. Here’s where the fruit cart and the horse get hooked up.

This is all very complex, why can’t I just buy the same server I have purchased for the last 5 years?

Sorry, kids. They don’t build ’em like they used to. In today’s market, the UCS system from Cisco is a new buzz alongside the original players IBM, HP, and Dell. How do you sort any of that out among the offerings, and how do you select the right platform for your new ESX system? By the socket! Every x86 system has them, whether from the Intel or AMD families. And now that you pay for your hypervisor and additional tools (CapacityIQ, AppSpeed, Nexus 1000V) per socket, it matters more. I need to squeeze the value out of those sockets.

Still staying in the upper half of the quad, let’s measure cores and RAM as a ratio, assuming dual-rank 4 GB DIMMs, and apply that to some of the standard 2-socket servers.

Standard Intel X5450

2 Socket – 4 Core – 16 DIMMs (8 per socket) produces 4 cores / 32 GB RAM per socket

Standard Intel Nehalem X5500

2 Socket – 4 Core – 18 DIMMs (9 per socket) produces 4 cores / 36 GB RAM per socket

Cisco UCS extension on X5500

2 Socket – 4 Core – 48 DIMMs (24 per socket) produces 4 cores / 96 GB RAM per socket
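The per-socket figures can be derived straight from the DIMM counts. This is a small sketch under the stated assumption of 4 GB DIMMs; the function name `socket_ratio` and the dictionary layout are mine, not part of any tool.

```python
DIMM_GB = 4  # dual-rank 4 GB DIMMs, as assumed in the text


def socket_ratio(cores_per_socket: int, dimms_per_socket: int,
                 dimm_gb: int = DIMM_GB) -> tuple:
    """Return (cores, GB of RAM) available behind one socket."""
    return cores_per_socket, dimms_per_socket * dimm_gb


# (cores per socket, DIMMs per socket) for the three example platforms.
platforms = {
    "Intel X5450": (4, 8),
    "Intel Nehalem X5500": (4, 9),
    "Cisco UCS on X5500": (4, 24),
}

for name, (cores, dimms) in platforms.items():
    c, ram_gb = socket_ratio(cores, dimms)
    print(f"{name}: {c} cores / {ram_gb} GB RAM per socket")
```

Swapping `dimm_gb` for a different DIMM size changes every platform by the same factor, so the ranking between platforms stays the same.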

What this shows is that for every license of ESX consumed in the environment, a different amount of memory is available for VMs to use. The UCS approach allows a much higher memory allowance per VM at the same licensing cost. Sure, you could buy 4-way servers and claim that the 256 GB of RAM gives the VM more allowance, but in reality the VM will see the same ratios of contention for vCPU and memory within each of the 4 sockets. Moving to a 4-way changes the size of the container, but it doesn’t change that container’s ratio of cores to memory.

CPU Contention

The idea of CPU contention is becoming more visible to most administrators of virtualized environments because the desire to pack VMs onto a host is so strong. If I can get 10 VMs on a host for $5000, then getting 25 VMs on the same host lowers my cost per VM. It could also cheat your customers of the performance they paid for, especially if multiple vCPUs are assigned to those 25 VMs. This is where the ratio of VMs per host becomes obsolete and vCPU per core makes more sense.
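A quick sketch of why the two ratios tell different stories. The $5000 host and the 10 and 25 VM counts come from the example above; the 8 physical cores (2 sockets x 4 cores) and the vCPU counts per VM are assumptions I picked for illustration.

```python
def density(host_cost: float, physical_cores: int, vcpus_per_vm: list) -> tuple:
    """Return (cost per VM, vCPUs per physical core) for one host.

    vcpus_per_vm holds one entry per VM: that VM's vCPU count.
    """
    cost_per_vm = host_cost / len(vcpus_per_vm)
    vcpu_per_core = sum(vcpus_per_vm) / physical_cores
    return cost_per_vm, vcpu_per_core


HOST_COST = 5000
CORES = 8  # assumption: 2 sockets x 4 cores

light = density(HOST_COST, CORES, [1] * 10)   # 10 single-vCPU VMs
packed = density(HOST_COST, CORES, [2] * 25)  # 25 dual-vCPU VMs

print(f"light:  ${light[0]:.0f} per VM, {light[1]:.2f} vCPU per core")
print(f"packed: ${packed[0]:.0f} per VM, {packed[1]:.2f} vCPU per core")
```

Cost per VM falls from $500 to $200, which looks like a win, while vCPU per core climbs from 1.25 to 6.25. Only the second number says anything about how contended the host is.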

Using the example containers above, you can generate an expected number of VMs per socket. There is no reason to use a 1:1 ratio of vCPUs to cores, because the point of virtualization is to run more with less. I think a good vCPU-to-core ratio to start with is 4:1 for production VMs and 16:1 for a VDI implementation:

Standard Intel X5450 - (4 / 32 GB SocketRatio) yields 16 VMs with a 1 vCPU / 2 GB RAM configuration per socket

Standard Intel Nehalem X5500 - (4 / 36 GB SocketRatio) yields 16 VMs with a 1 vCPU / 2.25 GB RAM configuration per socket

Cisco UCS - (4 / 96 GB SocketRatio) yields 16 VMs with a 1 vCPU / 6 GB RAM configuration per socket
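The yield per socket is just cores times the vCPU-to-core ratio, with the socket's RAM divided evenly across the result. Here is a sketch for the two ends of the range; `expected_vms_per_socket` is a name I made up, and the even split of RAM is a simplifying assumption.

```python
def expected_vms_per_socket(cores: int, ram_gb: float,
                            vcpu_per_core: int = 4, vcpu_per_vm: int = 1) -> tuple:
    """Return (expected VM count, GB of RAM per VM) for one socket."""
    vms = (cores * vcpu_per_core) // vcpu_per_vm
    return vms, ram_gb / vms


print(expected_vms_per_socket(4, 32))                    # X5450 at 4:1
print(expected_vms_per_socket(4, 96))                    # UCS at 4:1
print(expected_vms_per_socket(4, 96, vcpu_per_core=16))  # UCS at the VDI 16:1
```

At 4:1 both sockets carry 16 single-vCPU VMs, but the UCS socket gives each one 6 GB instead of 2 GB. At the 16:1 VDI ratio the same UCS socket carries 64 desktops at 1.5 GB each.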

You can always adjust your actual deployment if these ratios don’t match your environment. The expected deployment number helps determine how large the pizza slices are for the team, not how many slices each of them consumes. In these configurations you can see where the density of RAM per socket (SocketRatio) of the UCS allows for much larger VM configurations before overcommitment, a nice fit for the new 64-bit installations. These expected numbers of VMs per socket help determine the burn rate of a C:M:N:D value for the CapEx spend you made.

Burn Rate

To fully understand how much a VM costs, one has to look at what was spent in CapEx on the host and agree on the measuring stick for the C:M:N:D value of the created VM. If hosts from different families are in service at different points in their lifecycle, there may have to be some averaging. The SocketRatio of cores to RAM is a consistent way to measure systems from different form factors and families and to level-set the expected allocation of VMs. The expected allocation of VMs for a host helps determine what density ratio is desired for vCPU:vMEM.
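As a back-of-the-envelope sketch, the CapEx side of the burn rate can be spread over the expected VM allocation and the host's service life. The $5000 host and 16 VMs per socket come from the examples earlier; the 3-year service life is purely my assumption, and power, cooling, and licensing are deliberately left out.

```python
HOURS_PER_YEAR = 24 * 365


def capex_burn_rate(host_capex: float, sockets: int,
                    vms_per_socket: int, years_in_service: float) -> float:
    """CapEx-only cost per VM-hour at the expected VM allocation."""
    expected_vms = sockets * vms_per_socket
    return host_capex / (expected_vms * years_in_service * HOURS_PER_YEAR)


# Assumptions: $5000 two-socket host, 16 expected VMs per socket, 3 years.
rate = capex_burn_rate(5000, sockets=2, vms_per_socket=16, years_in_service=3)
print(f"${rate:.4f} per VM-hour (CapEx only)")
```

Under these assumptions the hardware alone comes to roughly six tenths of a cent per VM-hour, which gives a baseline to set against the nickel-per-hour price before the other cost areas are added in.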

This is the end of Part 1. In Part Deux I will take a deeper dive into the Compute and I/O areas and assign a more detailed cost-per-VM model.
