Chapter 7: Additional Metrics – The Definitive Guide to IT Service Metrics

CHAPTER 7: ADDITIONAL METRICS

A wide range of operational infrastructure systems, resources, capabilities, and processes may be part of IT service delivery. The focus of this chapter is the processes, outside of the service management lifecycle, that support the business and the service provider. The information in this chapter provides metrics for those additional processes that support the provision of services. The metrics sections for this chapter are:

Project management metrics

Risk management metrics

Data center metrics.

Project management metrics

Metrics most commonly applied in project management are based on Earned Value Management (EVM) for monitoring and controlling cost and schedule performance. These are standard measurements that provide the means of objectively determining variances from planned performance, evaluating the root causes, and gauging corrective actions.

Other important service performance measurements used for managing service projects (quality, risk, etc.) are provided throughout the book in sections that address specific service lifecycle processes. The metrics in this section are:

  •   Cost Variance (CV)
  •   Cost Variance Percentage (CV %)
  •   Cost Performance Index (CPI)
  •   Cost Estimate at Completion (Cost EAC)
  •   Estimate To Complete (ETC)
  •   Percent Complete (% Complete)
  •   Percent Spent (% Spent)
  •   Schedule Variance (SV)
  •   Schedule Variance Percentage (SV %)
  •   Schedule Performance Index (SPI)
  •   Schedule Estimate at Completion (Schedule EAC).

Metric name

Cost Variance (CV)

Metric category

Program/project management

Suggested metric owner

Program/project manager

Typical stakeholders

IT operations manager, service manager

Description

This is the standard measurement for cost performance as measured by comparing the budgeted value of work accomplished with the actual costs incurred for accomplishing that work.

Measurement description

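Formula: CV = BCWP - ACWP

where BCWP is the Budgeted Cost of Work Performed (the Earned Value, EV) and ACWP is the Actual Cost of Work Performed.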

Note

The CV is a value reported conventionally in monetary terms but can be stated in labor hours or other units of production. It is also expressed as a percentage derived from comparison to the approved Performance Measurement Baseline (PMB).
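As a minimal sketch of the comparison described above, the following Python computes CV as the budgeted value of work accomplished minus the actual costs incurred, and also states it as a percentage of the earned value. The function names and the sample figures are illustrative assumptions, not values taken from a real project.

```python
def cost_variance(bcwp: float, acwp: float) -> float:
    """Cost Variance: budgeted value of work performed minus actual cost.

    A negative result means the work cost more than budgeted (overrun);
    a positive result means it cost less than budgeted (underrun).
    """
    return bcwp - acwp


def cost_variance_pct(bcwp: float, acwp: float) -> float:
    """CV expressed as a percentage of the earned value (BCWP)."""
    if bcwp == 0:
        raise ValueError("BCWP must be non-zero to express CV as a percentage")
    return 100 * cost_variance(bcwp, acwp) / bcwp


# Hypothetical cumulative figures in any consistent unit (currency or hours).
cv = cost_variance(bcwp=40_000, acwp=44_000)
cv_pct = cost_variance_pct(bcwp=40_000, acwp=44_000)
print(f"CV = {cv:,.0f}, CV % = {cv_pct:.1f}%")
```

The same functions can be applied to a single period or to the cumulative (to date) values, provided BCWP and ACWP cover the same span.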

Frequency

The CV is calculated for each period (e.g. monthly) and cumulatively, applying the accrued (i.e. to date) BCWP and ACWP values.

Measured:

Monthly

Reported:

Monthly

Acceptable quality level: +/- 10% Variance from the approved PMB

Range:

> +/- 10% unacceptable (project costs possibly out of control)

= 90% to 110% of budgeted costs acceptable (maintain/improve)

< +/- 10% exceeds (indicates project costs are under control)

Metric name

Cost Variance Percentage (CV %)

Metric category

Program/project management

Suggested metric owner

Program/project manager

Typical stakeholders

IT operations manager, service manager

Description

This is the cost performance measurement yielded by the ratio of CV to EV. The metric is CV as a percentage of the overall budget performance to date. The BCWP or EV value applied can be monetary and/or labor hours as a commonly used valuation that reflects the approved budget.

Measurement description

Formula: CV % = (CV / BCWP) x 100

Frequency

CV % is reported monthly across all project lifecycle phases to provide insight into the magnitude of periodic and cumulative budget risk exposure.

Measured:

Monthly

Reported:

Monthly

Acceptable quality level: +/- 10% of budgeted costs

Range:

> +/- 10% unacceptable (project costs possibly out of control)

= 90% to 110% of budgeted costs acceptable (maintain/improve)

< +/- 10% exceeds (indicates project costs are under control)

Metric name

Cost Performance Index (CPI)

Metric category

Program/project management

Suggested metric owner

Program/project manager

Typical stakeholders

IT operations manager, service manager

Description

This is a measurement of cost performance efficiency for work accomplished; expressed by the ratio of the budgeted value of work accomplished to actual costs incurred.

Measurement description

Formula: CPI = BCWP / ACWP
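The ratio described above, together with a check against the 0.90 to 1.10 band given as the acceptable quality level, might be sketched as follows. The function names and the default tolerance parameter are assumptions made for illustration.

```python
def cost_performance_index(bcwp: float, acwp: float) -> float:
    """CPI: budgeted value of work performed per unit of actual cost."""
    if acwp <= 0:
        raise ValueError("ACWP must be positive to compute CPI")
    return bcwp / acwp


def within_tolerance(index: float, tolerance: float = 0.10) -> bool:
    """True when the index lies in the band 1.00 +/- tolerance."""
    return 1.0 - tolerance <= index <= 1.0 + tolerance


# Hypothetical cumulative values: 48,000 earned for 50,000 spent.
cpi = cost_performance_index(bcwp=48_000, acwp=50_000)
status = "acceptable" if within_tolerance(cpi) else "outside tolerance"
print(f"CPI = {cpi:.2f} ({status})")
```

A CPI below 1.00 indicates that less value is being earned than is being spent; a CPI above 1.00 indicates the opposite.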

Frequency

 

Measured:

Monthly

Reported:

Monthly

Acceptable quality level: +/- 0.10

Range:

> +/- 0.10 unacceptable (project costs possibly out of control)

= 0.90 to 1.10 acceptable (maintain/improve)

< +/- 0.10 exceeds (indicates project costs are under control)

Metric name

Cost Estimate At Completion (Cost EAC)

Metric category

Program/project management

Suggested metric owner

Program/project manager

Typical stakeholders

IT operations manager, service manager

Description

This measurement is an empirically-based forecast of total costs that will be expended upon completion of a task, control account, or program/project. A conventional Cost EAC calculation is the BAC divided by the cumulative cost performance efficiency as shown by the Cost Performance Index (CPI).

Measurement description

Formula: Cost EAC = BAC / CPI

where BAC is the Budget At Completion (the total authorized budget).

Note

Initially (i.e. for planning) the Cost EAC is simply the sum of cost estimates for budget formation.
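The conventional calculation described above (BAC divided by the cumulative CPI) can be sketched in Python as shown below. Treating a project with no earned value and no actual cost as still being in planning, and returning BAC in that case, is an assumption drawn from the note above; the function name and sample values are hypothetical.

```python
def cost_eac(bac: float, bcwp: float, acwp: float) -> float:
    """Forecast total cost at completion as BAC / CPI."""
    if bcwp == 0 and acwp == 0:
        # Planning stage: no performance data yet, so the forecast is the budget.
        return float(bac)
    if bcwp <= 0 or acwp <= 0:
        raise ValueError("BCWP and ACWP must both be positive once work has started")
    cpi = bcwp / acwp
    return bac / cpi


# Hypothetical project: 100,000 budget, 40,000 earned, 50,000 spent to date.
bac = 100_000
eac = cost_eac(bac, bcwp=40_000, acwp=50_000)
variance_from_bac_pct = 100 * (eac - bac) / bac
print(f"Cost EAC = {eac:,.0f} ({variance_from_bac_pct:+.1f}% against BAC)")
```

Expressing the forecast as a percentage of BAC makes it directly comparable with the +/- 10% acceptable quality level.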

Frequency

 

Measured:

Monthly

Reported:

Monthly

Acceptable quality level: +/- 10% of BAC

Range:

> +/- 10% unacceptable (project costs possibly out of control)

= 90% to 110% acceptable (maintain/improve)

< +/- 10% exceeds (indicates project costs are under control)

Metric name

Estimate To Complete (ETC)

Metric category

Program/project management

Suggested metric owner

Program/project manager

Typical stakeholders

IT operations manager, service manager

Description

This measurement is a forecast of the costs that will be incurred for remaining work based on the current level of cost performance efficiency. The ETC calculation considers the cumulative actual costs incurred to date in order to reflect only the value of remaining work as a component of the EAC.

Measurement description

Formula: ETC = Cost EAC - ACWP

Note: Initially (i.e. during program/project planning) EAC = ETC = BAC, which is the sum of cost estimates for budget formation.
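A short sketch of the ETC calculation follows. It subtracts the cumulative actual costs from the EAC, and also shows an equivalent form, (BAC - BCWP) / CPI, that gives the same result when the EAC is computed as BAC / CPI. Function names and figures are illustrative assumptions.

```python
def estimate_to_complete(eac: float, acwp: float) -> float:
    """ETC: forecast total cost minus the cost already incurred."""
    return eac - acwp


def etc_from_remaining_work(bac: float, bcwp: float, acwp: float) -> float:
    """Equivalent form: budget for the remaining work divided by CPI."""
    if bcwp <= 0 or acwp <= 0:
        raise ValueError("BCWP and ACWP must both be positive")
    cpi = bcwp / acwp
    return (bac - bcwp) / cpi


# Hypothetical cumulative values.
bac, bcwp, acwp = 100_000, 40_000, 50_000
eac = bac / (bcwp / acwp)              # Cost EAC = BAC / CPI
etc_a = estimate_to_complete(eac, acwp)
etc_b = etc_from_remaining_work(bac, bcwp, acwp)
print(f"ETC = {etc_a:,.0f} (cross-check: {etc_b:,.0f})")
```

The two forms agree because BAC / CPI - ACWP can be rearranged to (BAC - BCWP) / CPI when CPI = BCWP / ACWP.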

Frequency

 

Measured:

Monthly

Reported:

Monthly

Acceptable quality level: +/- 10% variance from BAC

Range:

> +/- 10% unacceptable (project costs possibly out of control)

= 90% to 110% acceptable (maintain/improve)

< +/- 10% exceeds (indicates project costs are under control)

Metric name

Percent Complete (% Complete)

Metric category

Program/project management

Suggested metric owner

Program/project manager

Typical stakeholders

IT operations manager, service manager

Description

This metric is a measurement of performance progress to date expressed by the ratio of the budget for work accomplished to the total authorized budget for all of the planned work.

Measurement description

Formula: % Complete = (BCWP / BAC) x 100

Note

Combined with the Percent Spent (% Spent) this metric provides a means of quickly evaluating the performance status.

Frequency

 

Measured:

Monthly

Reported:

Monthly

Acceptable quality level: The % Complete should be in line with the cumulative program/project costs.

Range:

N/A

Metric name

Percent Spent (% Spent)

Metric category

Program/project management

Suggested metric owner

Program/project manager

Typical stakeholders

IT operations manager, service manager

Description

This metric is a measurement of budget expenditures to date expressed by the ratio of the actual costs incurred for work accomplished to the total authorized budget for all of the planned work.

Measurement description

Formula: % Spent = (ACWP / BAC) x 100

Note

Combined with the % Complete this metric provides a means of quickly evaluating the performance status.
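To illustrate how the two percentages can be read together, the sketch below computes both from the same budget and reports the gap between them. The helper names and the sample values are assumptions for illustration only.

```python
def percent_complete(bcwp: float, bac: float) -> float:
    """Budgeted value of work accomplished as a percentage of the total budget."""
    if bac <= 0:
        raise ValueError("BAC must be positive")
    return 100 * bcwp / bac


def percent_spent(acwp: float, bac: float) -> float:
    """Actual costs incurred as a percentage of the total budget."""
    if bac <= 0:
        raise ValueError("BAC must be positive")
    return 100 * acwp / bac


def spend_gap(bcwp: float, acwp: float, bac: float) -> float:
    """Percentage points by which spending runs ahead of progress."""
    return percent_spent(acwp, bac) - percent_complete(bcwp, bac)


# Hypothetical cumulative values.
bac, bcwp, acwp = 100_000, 40_000, 50_000
print(f"% Complete = {percent_complete(bcwp, bac):.0f}%")
print(f"% Spent    = {percent_spent(acwp, bac):.0f}%")
print(f"Gap        = {spend_gap(bcwp, acwp, bac):+.0f} points")
```

A positive gap means a larger share of the budget has been spent than has been earned, which is the quick warning sign the note refers to.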

Frequency

 

Measured:

Monthly

Reported:

Monthly

Acceptable quality level: The % Spent should be in line with the program/project performance schedule.

Range:

N/A

Metric name

Schedule Variance (SV)

Metric category

Program/project management

Suggested metric owner

Program/project manager

Typical stakeholders

IT operations manager, service manager

Description

This is the standard measurement for schedule performance as measured by comparing the budgeted value of work accomplished with the budgeted costs of the work planned for completion in the period.

Measurement description

Formula: SV = BCWP - BCWS

where BCWS is the Budgeted Cost of Work Scheduled (the Planned Value, PV).

Note

The SV value is reported conventionally in monetary terms, but can be stated in labor hours or other units of production. It is also expressed as a percentage derived from comparison to the approved Performance Measurement Baseline (PMB).
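Because SV is reported both per period and cumulatively, the following sketch derives both series from monthly BCWP and BCWS values. The monthly figures and function names are hypothetical.

```python
from itertools import accumulate


def schedule_variance(bcwp: float, bcws: float) -> float:
    """SV: budgeted value of work performed minus budgeted value of work scheduled."""
    return bcwp - bcws


def period_and_cumulative_sv(bcwp_by_period, bcws_by_period):
    """Return (per-period SV list, cumulative SV list) for aligned series."""
    if len(bcwp_by_period) != len(bcws_by_period):
        raise ValueError("BCWP and BCWS series must cover the same periods")
    period = [schedule_variance(p, s)
              for p, s in zip(bcwp_by_period, bcws_by_period)]
    cumulative = [schedule_variance(p, s)
                  for p, s in zip(accumulate(bcwp_by_period),
                                  accumulate(bcws_by_period))]
    return period, cumulative


# Hypothetical monthly values for the first three months of a project.
bcws_by_month = [10_000, 15_000, 20_000]
bcwp_by_month = [9_000, 14_000, 21_000]

period_sv, cumulative_sv = period_and_cumulative_sv(bcwp_by_month, bcws_by_month)
sv_pct_to_date = 100 * cumulative_sv[-1] / sum(bcws_by_month)
print("Period SV:    ", period_sv)
print("Cumulative SV:", cumulative_sv)
print(f"SV % to date:  {sv_pct_to_date:.1f}%")
```

In this example the third month recovers some ground, which the period values show immediately while the cumulative values still report a negative position.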

Frequency

The SV is calculated for each period (e.g. monthly) and cumulatively, applying the accrued (i.e. to date) BCWP and BCWS values.

Measured:

Monthly

Reported:

Monthly

Acceptable quality level: +/- 10% variance from the approved PMB

Range:

> +/- 10% unacceptable (project schedule possibly out of control)

= 90% to 110% of scheduled costs acceptable (maintain/improve)

< +/- 10% exceeds (indicates project schedule is under control)

Metric name

Schedule Variance Percentage (SV %)

Metric category

Program/project management

Suggested metric owner

Program/project manager

Typical stakeholders

IT operations manager, service manager

Description

This is the schedule performance measurement yielded by the ratio of the Schedule Variance (SV) to the Planned Value (PV). The metric is SV as a percentage of the value of work scheduled to date. The BCWS or PV value applied can be monetary and/or labor hours as a commonly used valuation that reflects the scheduled work.

Measurement description

Formula: SV % = (SV / BCWS) x 100

Frequency

SV % is reported monthly across all project lifecycle phases to provide insight into the magnitude of periodic and cumulative schedule risk exposure.

Measured:

Monthly

Reported:

Monthly

Acceptable quality level: +/- 10% of the budget for scheduled work

Range:

> +/- 10% unacceptable (project schedule possibly out of control)

= 90% to 110% of planned costs acceptable (maintain/improve)

< +/- 10% exceeds (indicates project schedule is under control)

Metric name

Schedule Performance Index (SPI)

Metric category

Program/project management

Suggested metric owner

Program/project manager

Typical stakeholders

IT operations manager, service manager

Description

This is a measurement of schedule performance efficiency for work accomplished, expressed by the ratio of the budgeted value of work performed to budgeted value of work scheduled (i.e. planned).

Measurement description

Formula: SPI = BCWP / BCWS

Frequency

 

Measured:

Monthly

Reported:

Monthly

Acceptable quality level: +/- 0.10

Range:

> +/- 0.10 unacceptable (project schedule possibly out of control)

= 0.90 to 1.10 acceptable (maintain/improve)

< +/- 0.10 exceeds (indicates project schedule is under control)

Metric name

Schedule Estimate At Completion (Schedule EAC)

Metric category

Program/project management

Suggested metric owner

Program/project manager

Typical stakeholders

IT operations manager, service manager

Description

This measurement is an empirically based forecast of the schedule at completion of a task, control account, or program/project. A conventional Schedule EAC calculation is the BAC divided by the cumulative schedule performance efficiency as shown by the SPI.

Measurement description

Formula: Schedule EAC = BAC / SPI

Note

Initially (i.e. for planning) the Schedule EAC is simply the sum of cost estimates for planned work applied in budget formation.
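The calculation described in this section (BAC divided by the cumulative SPI) can be sketched as follows. As defined here, the result is stated in the same units as BAC rather than as a date or duration; the function names and sample values are assumptions for illustration.

```python
def schedule_performance_index(bcwp: float, bcws: float) -> float:
    """SPI: budgeted value of work performed per unit of work scheduled."""
    if bcws <= 0:
        raise ValueError("BCWS must be positive to compute SPI")
    return bcwp / bcws


def schedule_eac(bac: float, bcwp: float, bcws: float) -> float:
    """Schedule EAC as described in this section: BAC / SPI."""
    spi = schedule_performance_index(bcwp, bcws)
    if spi <= 0:
        raise ValueError("SPI must be positive to forecast")
    return bac / spi


# Hypothetical cumulative values: 44,000 earned against 45,000 scheduled.
bac = 100_000
spi = schedule_performance_index(bcwp=44_000, bcws=45_000)
forecast = schedule_eac(bac, bcwp=44_000, bcws=45_000)
print(f"SPI = {spi:.3f}, Schedule EAC = {forecast:,.0f} "
      f"({100 * (forecast - bac) / bac:+.1f}% against BAC)")
```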

Frequency

 

Measured:

Monthly

Reported:

Monthly

Acceptable quality level: +/- 10% of BAC

Range:

> +/- 10% unacceptable (project schedule possibly out of control)

= 90% to 110% acceptable (maintain/improve)

< +/- 10% exceeds (indicates project schedule is under control)

Risk management metrics

The risk management process includes risk management planning, risk identification, risk analysis, risk response planning and risk monitoring and control. The overall objectives of risk management activities are to increase the probability and impact of positive events, and minimize the probability and impact of negative events.

Risk management focuses on the uncertainty of future events or conditions that will impact service performance objectives if they occur. The purpose of the risk management process is to monitor risk event indicators (the risk horizon) and be prepared to respond. The metrics in this section are:

  •   Average time to conduct a risk assessment
  •   Number of risk assessments performed (in a defined period)
  •   Number of outstanding remediation actions
  •   Number of active risks identified
  •   Number of closed risks (in a defined period).

Metric name

Average time to conduct a risk assessment

Metric category

Risk management

Suggested metric owner

Risk manager

Typical stakeholders

Customers, management, financial management, IT staff, suppliers, process owners

Description

A risk assessment evaluates the situation at hand and determines the probability of occurrence and the potential impact of risk. Assessments will require different timeframes and resources to fully evaluate all potential risks based on each situation.

This metric defines the average time required to conduct a risk assessment, which helps in planning the time needed for each assessment. Several organizations in volatile industries have dedicated risk management departments. Depending on the size of the organization and the industry, a risk management department can reach its resource limits very quickly and must therefore plan appropriately.

Measurement description

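Formula: Average time = Total time spent on risk assessments in the period / Number of risk assessments performed in the period

As mentioned above, this is a valuable planning metric for risk assessments. The metric can be categorized by the type of assessment performed to improve overall planning.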

Frequency

 

Measured:

Quarterly

Reported:

Annually

Acceptable quality level: Create a baseline and monitor

Range:

Dependent on opportunities found for future services

Note

Average time should include assessment preparation time, the time for the assessment, and time after the assessment to analyze and create the risk report.
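As a minimal sketch, the average can be computed from per-assessment records that capture the three components named in the note (preparation, the assessment itself, and analysis and reporting), both overall and by assessment type. The record structure, field names, and sample hours are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import mean


@dataclass
class Assessment:
    kind: str                # hypothetical category, e.g. "new service"
    prep_hours: float        # preparation before the assessment
    assessment_hours: float  # conducting the assessment
    reporting_hours: float   # analysis and risk report afterwards

    @property
    def total_hours(self) -> float:
        return self.prep_hours + self.assessment_hours + self.reporting_hours


def average_time(assessments) -> float:
    """Average total hours per assessment."""
    return mean(a.total_hours for a in assessments)


def average_time_by_kind(assessments) -> dict:
    """Average total hours per assessment, grouped by assessment type."""
    groups = defaultdict(list)
    for a in assessments:
        groups[a.kind].append(a.total_hours)
    return {kind: mean(hours) for kind, hours in groups.items()}


# Hypothetical records for one quarter.
records = [
    Assessment("new service", 8, 16, 6),
    Assessment("major change", 4, 8, 4),
    Assessment("new service", 10, 20, 10),
]
print(f"Overall average: {average_time(records):.1f} hours")
print("By type:", average_time_by_kind(records))
```

Grouping by type is what makes the figure useful for planning, since a facility change and a minor technology change are unlikely to need the same effort.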

Metric name

Number of risk assessments performed (in a defined period)

Metric category

Risk management

Suggested metric owner

Risk manager

Typical stakeholders

Customers, management, financial management, IT staff, suppliers, process owners

Description

This metric is a simple counter for the number of risk assessments performed within a defined time period (e.g. annually or semi-annually). This metric can be used to justify resources, plan budgets, or support scheduling.

A risk assessment should be considered for all significant changes within the organization. For an IT service provider a risk assessment should occur for the following situations:

  •   New services added
  •   Major changes
  •   Facility changed
  •   Additional customers
  •   Mergers and acquisitions
  •   Adding or changing technology.

Measurement description

Formula: N/A

Understanding how many assessments are performed demonstrates the organization’s commitment to risk management and prevention. The information gained from each assessment builds cumulative knowledge that can be applied across the organization.

Frequency

 

Measured:

Quarterly

Reported:

Annually

Acceptable quality level: N/A

Range:

N/A

Metric name

Number of outstanding remediation actions

Metric category

Risk management

Suggested metric owner

Risk manager

Typical stakeholders

Customers, management, financial management, IT staff, suppliers, process owners

Description

Remediation action follows the risk assessment and is used to mitigate the findings of the assessment. These actions are created to respond to the findings and eliminate the threat of the risk. From the Management of Risk® (M_o_R®), the actions are categorized as:

  •   Transfer (move the risk to another party)
  •   Tolerate (accept and do nothing)
  •   Terminate (re-scope to remove)
  •   Treat (take actions to handle the risk).

This metric ensures the remediation actions are monitored and handled promptly. This metric can be enhanced with associated information from the assessment such as the risk, risk priority, assessment, and associated change record.

Measurement description

Formula: N/A

The metric can be collected from a risk register or risk log. Change management reports can add value through the documented change record associated with the risk. The change report provides evidence of the remediation efforts and the success of the actions.

Frequency

 

Measured:

Quarterly

Reported:

Annually

Acceptable quality level: N/A

Range:

N/A

Metric name

Number of active risks identified

Metric category

Risk management

Suggested metric owner

Risk manager

Typical stakeholders

Customers, management, financial management, IT staff, suppliers, process owners

Description

Once identified, risks should be documented in a risk register or risk log and monitored regularly. Many risks can be mitigated quickly while others may require ongoing attention and monitoring. In either case, these active risks should be assigned to a risk owner and managed to protect the organization from the threat posed by the risk. Active risks can include:

  •   New risks from an assessment
  •   Risks being remediated
  •   Ongoing risks that are tolerated and monitored.

This metric provides an understanding of the risks that pose an active threat to services provided to the customer. Active risks should be prioritized based on the level and extent of the finding from the risk assessment.

Measurement description

Formula: N/A

A status field should be included within the risk register or risk log to record the current status of each risk (such as active). This field can be used to report and manage the current risks to the organization.
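A sketch of how such a status field could drive the count is shown below. The register layout, the field names, and the set of status values treated as active (new, in remediation, tolerated) are assumptions chosen to mirror the categories of active risk listed in the description; a real register will have its own values.

```python
from collections import Counter

# Hypothetical status values that count as "active".
ACTIVE_STATUSES = {"new", "in remediation", "tolerated"}

# Hypothetical risk register extract.
risk_register = [
    {"id": "R-001", "status": "New", "owner": "Service manager"},
    {"id": "R-002", "status": "In remediation", "owner": "Change manager"},
    {"id": "R-003", "status": "Closed", "owner": "Risk manager"},
    {"id": "R-004", "status": "Tolerated", "owner": "Risk manager"},
]


def normalize(status: str) -> str:
    """Compare statuses case-insensitively and ignore stray whitespace."""
    return status.strip().lower()


def count_by_status(register) -> Counter:
    """Number of risks per normalized status value."""
    return Counter(normalize(entry["status"]) for entry in register)


def count_active(register) -> int:
    """Number of risks whose status is one of the active statuses."""
    return sum(1 for entry in register
               if normalize(entry["status"]) in ACTIVE_STATUSES)


print("Active risks:", count_active(risk_register))
print("By status:", dict(count_by_status(risk_register)))
```

The same per-status count also yields the number of closed risks for a period if the register extract is filtered by closure date first.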

Frequency

 

Measured:

Quarterly

Reported:

Annually

Acceptable quality level: N/A

Range:

N/A

Metric name

Number of closed risks (in a defined period)

Metric category

Risk management

Suggested metric owner

Risk manager

Typical stakeholders

Customers, management, financial management, IT staff, suppliers, process owners

Description

Closed risks provide a history of risk management actions and activities. These risks contain valuable knowledge of risks identified and managed which will help improve risk management going forward.

This metric demonstrates the ongoing activities of risk management and the success of mitigation action through closed risks. We recommend that all closed risks have the associated change record documented providing evidence for closure.

Measurement description

Formula: N/A

Input for this metric can be provided from multiple resources such as risk management, security, or auditing. It doesn’t matter where the risk is identified as long as it is documented and managed properly.

Frequency

 

Measured:

Quarterly

Reported:

Quarterly

Acceptable quality level: N/A

Range:

N/A

Data center metrics

A data center is an IT services production and support environment, hosting the infrastructure required to provide services to the business. This section provides some basic measurements for evaluating and managing the data center environment.

It is important to note that this section is not intended as a data center engineering knowledge base. Rather, it provides some of the fundamental measurements that apply to the operation of data center environmental systems. The metrics presented are:

  •   Data center heating and cooling
  •   Data center primary power
  •   Data center backup power.

Metric name

Data center heating and cooling

Metric category

Data center environmental systems

Suggested metric owner

Facilities manager

Typical stakeholders

IT operations manager, service desk, change manager, process owners

Description

Heating, ventilation and air conditioning (HVAC) systems provide data center environmental temperature and humidity control. Heat transfers from hot to cold: ‘cold’ is created by removing heat and ‘hot’ is created by adding heat. Accordingly, the fundamental concept of HVAC operations in a data center is the evacuation of heat from the environment.

British Thermal Units (BTUs) are commonly used as the measurement of heat. A BTU is the amount of heat required to raise the temperature of one pound (0.454 kg) of liquid water (0.1198 US gallons or 0.1 UK gallon) by 1°F (0.556°C). For cooling, a commonly used unit is the ‘ton of cooling’: one ton means an air conditioner can remove 12,000 BTUs of heat per hour (e.g. a 5 ton air conditioner removes 60,000 BTUs of heat per hour). For heating, the commonly used term is BTU/h (BTUs per hour) or MBH (1,000 BTUs per hour, where ‘M’ is the Roman numeral for 1,000).

These values are important for a fundamental understanding of heating and cooling systems that provide the data center operating environment for IT service assets.

Measurement description

Formulas:

  •   1 BTU = 0.293071 Wh (Watt hours) (approximately)
  •   1 Watt (1 W) = 3.41214 BTU/h (approximately)
  •   1,000 BTU/h = 293.071 W (approximately)
  •   12,000 BTU/h (one US ton of cooling) = 3.517 kW (approximately)
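The conversions above can be combined to estimate the cooling needed for a given IT heat load. The sketch below assumes, as a simplification, that all electrical power drawn by the equipment is dissipated as heat; the function names and the 10 kW example load are illustrative.

```python
WATTS_PER_BTU_PER_HOUR = 0.293071   # 1 BTU/h expressed in Watts (approximate)
BTU_PER_HOUR_PER_TON = 12_000       # one ton of cooling


def btu_per_hour_to_watts(btu_h: float) -> float:
    return btu_h * WATTS_PER_BTU_PER_HOUR


def watts_to_btu_per_hour(watts: float) -> float:
    return watts / WATTS_PER_BTU_PER_HOUR


def tons_of_cooling(heat_load_watts: float) -> float:
    """Tons of cooling needed to remove the given heat load."""
    return watts_to_btu_per_hour(heat_load_watts) / BTU_PER_HOUR_PER_TON


# Hypothetical IT load of 10 kW, assumed to be dissipated entirely as heat.
load_watts = 10_000
print(f"Heat load: {watts_to_btu_per_hour(load_watts):,.0f} BTU/h")
print(f"Cooling required: {tons_of_cooling(load_watts):.2f} tons")
```

Repeating the calculation whenever equipment is added or removed gives the incremental heat-load assessment described under Frequency.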

Frequency

 

Measured:

Monitored continuously for operations, assessed incrementally to determine the impact of the heat load from IT systems

Reported:

As required

Acceptable quality level: Establish the operational baseline, monitor and control in accordance with stipulated IT systems environmental operating parameters

Range:

N/A

Metric name

Data center primary power

Metric category

Data center electrical service

Suggested metric owner

Facilities manager

Typical stakeholders

IT operations manager, service desk, change manager, process owners

Description

IT systems require a constant supply of electricity for sustained, reliable operation to meet service availability targets. The electrical power loads of IT systems are expressed in either Watts or Volt-Amps (VA). The Watt rating is the real power drawn by the equipment. The VA rating is called the ‘apparent power’ and is the product of the voltage applied to the equipment and the current drawn by the equipment. The VA rating is always equal to or larger than the Watt rating. The ratio of the Watt rating to the VA rating is the Power Factor (PF), expressed as a decimal number (e.g. 0.6) or a percentage (e.g. 60%).

The sum of the power ratings of all data center equipment (IT systems and infrastructure systems) is used to determine the primary electrical capacity requirements. Generally, three-phase electrical service is standard for commercial facilities and single-phase electrical service is standard for residential facilities.

Measurement description

Formulas:

Electrical calculations:

  •   Volts x Amps = Watts or VA
  •   Volts = Watts / Amps
  •   Amps = Watts / Volts
  •   kVA (kilovolt-amps) = Volts x Amps / 1000
  •   kW (kilowatts) = kVA x PF
  •   Watts = VA x PF
  •   Amps = (VA x PF) / Volts
  •   Volts = (VA x PF) / Amps

Single-phase electrical service:

  •   Watts = Volts x Amps

Three-phase electrical service:

  •   Watts = Volts x Amps x 1.73
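A brief sketch of these calculations follows. It computes apparent power in kVA for single-phase and three-phase service and converts it to real power using the power factor. The 1.73 multiplier is an approximation of the square root of 3, which the code uses directly; the function names, the circuit values, and the 0.9 power factor are assumptions for illustration.

```python
import math


def apparent_power_kva(volts: float, amps: float, three_phase: bool = False) -> float:
    """Apparent power in kVA; three-phase applies the sqrt(3) (about 1.73) factor."""
    factor = math.sqrt(3) if three_phase else 1.0
    return volts * amps * factor / 1000


def real_power_kw(kva: float, power_factor: float) -> float:
    """Real power in kW from apparent power and power factor."""
    if not 0 < power_factor <= 1:
        raise ValueError("Power factor must be greater than 0 and at most 1")
    return kva * power_factor


# Hypothetical circuits.
single = apparent_power_kva(volts=120, amps=20)
three = apparent_power_kva(volts=208, amps=30, three_phase=True)
print(f"Single-phase 120 V / 20 A: {single:.2f} kVA")
print(f"Three-phase 208 V / 30 A:  {three:.2f} kVA, "
      f"{real_power_kw(three, 0.9):.2f} kW at PF 0.9")
```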

 

Frequency

 

Measured:

Incrementally as part of determining the impact of the power budget required for IT systems.

Reported:

As required

Acceptable quality level: Establish the operational baseline, monitor and control within the primary electrical power available

Range:

N/A

Metric name

Data center backup power

Metric category

Data center electrical service

Suggested metric owner

Facilities manager

Typical stakeholders

IT operations manager, service desk, change manager, process owners

Description

Backup power for IT systems is provided by an uninterruptible power supply (UPS, also termed battery backup or static UPS) and backup generators (also termed emergency generators or rotary UPS). Battery UPS units are intended for short-term loss of primary power (e.g. 15 minutes or less), while generators sustain operations through long-term power outages. The sizing of these backup power sources is critical for ensuring data center continuity of operations (COOP) and the availability of services supported by the data center.

UPS systems may have both Watt ratings and VA ratings. By common industry convention, the Watt rating is taken as approximately 60% of the VA rating, because 0.6 is the typical power factor of common loads. Therefore, when only a VA rating is published, it is safe to assume that the Watt rating of the UPS is 60% of that value. The conservative UPS sizing approach is to ensure that the sum of the equipment load ratings (Watts) is below 60% of the UPS VA rating. This usually results in an oversized UPS and thus a longer run time than expected.

Backup generators are typically rated in kilowatts (kW) with factory testing at a stipulated PF (e.g. 0.8 PF). Accordingly, either the Watts or VA load values can be used for sizing a generator. As with UPS units, the conservative approach is to apply a standard or factory PF (e.g. 0.6 to 0.8) to the total VA load.

Measurement description

Formula:

Calculating the load for UPS and backup generator sizing:

  •   Watts = VA x PF
  •   VA = Watts / PF
  •   PF = Watts / VA
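The conservative UPS sizing approach described above can be sketched as a simple check: sum the equipment loads in Watts and compare the total with 60% of the UPS VA rating. The 0.6 default reflects the power factor quoted in this section; the function names, equipment loads, and UPS ratings are hypothetical.

```python
ASSUMED_POWER_FACTOR = 0.6  # typical PF quoted in this section


def ups_watt_capacity(ups_va: float, power_factor: float = ASSUMED_POWER_FACTOR) -> float:
    """Watts = VA x PF."""
    return ups_va * power_factor


def minimum_ups_va(load_watts, power_factor: float = ASSUMED_POWER_FACTOR) -> float:
    """VA = Watts / PF, applied to the total equipment load."""
    return sum(load_watts) / power_factor


def load_fits(load_watts, ups_va: float,
              power_factor: float = ASSUMED_POWER_FACTOR) -> bool:
    """True when the total load is below the UPS Watt capacity."""
    return sum(load_watts) < ups_watt_capacity(ups_va, power_factor)


# Hypothetical equipment loads in Watts.
loads = [450, 300, 750]
print(f"Total load: {sum(loads)} W")
print(f"Minimum UPS rating: {minimum_ups_va(loads):,.0f} VA")
print("Fits a 3,000 VA UPS:", load_fits(loads, 3000))
print("Fits a 2,000 VA UPS:", load_fits(loads, 2000))
```

For generator sizing, the same functions can be called with the factory power factor (for example 0.8) in place of the default.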

 

 

Frequency

 

Measured:

Incrementally as part of determining the impact of the power budget required for IT systems

Reported:

As required

Acceptable quality level: Establish the operational baseline, monitor and control within the primary electrical power available

Range:

N/A